WO2019041918A1 - Data encoding method, apparatus, and storage medium - Google Patents

Data encoding method, apparatus, and storage medium

Info

Publication number
WO2019041918A1
Authority
WO
WIPO (PCT)
Prior art keywords
interval
current
coding
character
encoded
Prior art date
Application number
PCT/CN2018/088746
Other languages
English (en)
French (fr)
Inventor
吴军勇
Original Assignee
前海中科芯片控股 (深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 前海中科芯片控股 (深圳)有限公司
Priority to KR1020187017276A (KR20190038746A)
Publication of WO2019041918A1

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3068 Precoding preceding compression, e.g. Burrows-Wheeler transformation
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6011 Encoder aspects
    • H03M7/6064 Selection of Compressor
    • H03M7/6082 Selection strategies
    • H03M7/70 Type of the data to be coded, other than image and sound
    • H03M7/705 Unicode

Definitions

  • The present invention relates to the field of computer technologies, and in particular to a data encoding method, apparatus, and storage medium.
  • Data compression is a technique that reduces the amount of data without losing information, in order to save storage space and improve the efficiency of transmission, storage, and processing.
  • Lossless compression means that the data reconstructed (restored, decompressed) from the compressed data is exactly the same as the original data. Lossy compression means that the reconstructed data differs from the original data, but not in a way that causes the information expressed in the original data to be misunderstood.
  • data compression methods that is, encoding methods
  • redundancy such as binary stream
  • the embodiment of the present invention provides the following technical solutions:
  • a data encoding method comprising:
  • An encoding result is generated based on the encoded value, and the encoded result is output.
  • the embodiment of the present invention further provides the following technical solutions:
  • a data encoding device comprising:
  • An acquiring module configured to obtain a character string to be encoded, and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters;
  • a first determining module configured to determine a coefficient threshold according to a maximum number of consecutive occurrences of the second preset character in the character string to be encoded
  • a selection module configured to select, from the preset coefficient list, a coding coefficient that is smaller than the coefficient threshold
  • a second determining module configured to determine, according to the interval length and the coding coefficient, a first coding interval corresponding to the first preset character, and a second coding interval corresponding to the second preset character;
  • An encoding module configured to encode the to-be-encoded character string by using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value
  • a generating module configured to generate an encoding result according to the encoded value, and output the encoded result.
  • the encoding module specifically includes:
  • a first acquiring sub-module configured to acquire a current to-be-coded character, a current coding coefficient, a current first coding interval, a current second coding interval, and a current interval length;
  • a determining sub-module configured to determine a target interval from the current first coding interval and the current second coding interval according to the current to-be-coded character
  • an update submodule configured to update the current first coding interval and the current second coding interval according to the current interval length, the current coding coefficient, and the target interval, so as to encode the current to-be-coded character
  • a return module configured to, when the encoding is completed, use the updated first coding interval and the updated second coding interval as the current first coding interval and the current second coding interval, use the next to-be-coded character as the current to-be-coded character, and return to the operation of acquiring the current to-be-coded character, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length, until all the characters to be encoded have been encoded.
  • update submodule is specifically configured to:
  • the current first coding interval is updated by using a smaller subinterval after division, and the current second coding interval is updated by using a larger subinterval after division.
  • the return module is further configured to:
  • the next coding coefficient is taken as the current coding coefficient, and the next interval length is taken as the current interval length.
  • the encoding module further includes:
  • a second acquiring sub-module configured to acquire the two endpoint values of the current second coding interval when all the characters to be encoded have been encoded
  • a judging sub-module configured to determine whether the highest digits of the two endpoint values are the same
  • an output submodule configured to, if the judgment result is yes, output the shared digit as a target number, take the digit adjacent to the highest digit as the current highest digit, and return to the step of acquiring the two endpoint values of the current second coding interval, until the judgment result is no;
  • a sorting sub-module configured to sort the target numbers in output order to obtain an encoded value.
  • the generating module is specifically configured to:
  • the encoded value, the second quantity, and the total number are used as encoding results of the character string to be encoded.
  • the data encoding apparatus further includes a decoding module, configured to:
  • a reference character string, where the reference string includes a first quantity of first preset characters and a second quantity of second preset characters, the first quantity is equal to the difference between the total number and the second quantity, and the highest-order character of the reference string is the second preset character;
  • the encoded value is decoded according to the reference string.
  • the decoding module is specifically configured to:
  • the decoding result is generated based on all decoded characters.
  • the decoding module is specifically configured to:
  • the first preset character is determined as the decoded character.
  • the embodiment of the present invention further provides the following technical solutions:
  • a storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the data encoding method of any of the above.
  • In the data encoding method, device, and storage medium provided by the present invention, a character string to be encoded and a preset interval length are acquired, where the to-be-coded character string includes a plurality of first preset characters and second preset characters. A coefficient threshold is determined according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded, and a coding coefficient smaller than the coefficient threshold is selected from the preset coefficient list. A first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character are then determined according to the interval length and the coding coefficient. The to-be-encoded character string is encoded by using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value, and an encoding result is generated according to the encoded value and output, so that the lossless compression of the binary data can be better achieved, the compression
  • FIG. 1a is a schematic flowchart of a data encoding method according to an embodiment of the present invention
  • FIG. 1b is a schematic flowchart of step S105 according to an embodiment of the present disclosure
  • FIG. 1c is another schematic flowchart of step S105 according to an embodiment of the present disclosure.
  • FIG. 2 is another schematic flowchart of a data encoding method according to an embodiment of the present invention.
  • FIG. 3a is a schematic structural diagram of a data encoding apparatus according to an embodiment of the present invention.
  • FIG. 3b is a schematic structural diagram of an encoding module according to an embodiment of the present disclosure.
  • FIG. 3c is another schematic structural diagram of an encoding module according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the embodiment of the invention provides a data encoding method, device, storage medium and electronic device, which will be respectively described in detail below.
  • A data encoding method includes: acquiring a character string to be encoded and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters; determining a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded; selecting, from the preset coefficient list, a coding coefficient smaller than the coefficient threshold; determining, according to the interval length and the coding coefficient, a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character; encoding the to-be-encoded character string by using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value; and generating an encoding result according to the encoded value and outputting the encoding result.
  • the specific process of the data encoding method can be as follows:
  • S101 Obtain a character string to be encoded, and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters.
  • the to-be-encoded character string includes a binary character string
  • the first preset character may be 0, and the second preset character may be 1.
  • The preset interval length mainly limits the initial size of the coding space. It may be a manually set value such as 100000000000 or larger, determined according to actual needs.
  • The coefficient threshold corresponding to the maximum number of times can be obtained by table lookup. Its size usually depends only on the number of consecutive 1s in the string to be encoded: the more consecutive 1s, the smaller the threshold.
  • The relationship between the coefficient value and the number of consecutive 1s can be derived by computing over a large number of samples, and the coefficient threshold corresponding to each number of consecutive 1s is then stored in a table, so that when needed it can be obtained directly from the table according to the number of consecutive 1s. The calculation is mainly implemented through the formula, where i, j, n ∈ [1, Len], Len is the total character length of each sample, p(n) > 1, T is the total statistical value of all symbols in each sample, f is the statistical value of a symbol itself, O is the cumulative statistical value of all symbols before a given symbol, and ⁇ is the coefficient value. The relationship between the coefficient value and the number of consecutive 1s in the sample is determined from the change of p(n).
  • The preset coefficient list may be set in advance: the coefficient threshold corresponding to each number of consecutive 1s is calculated beforehand, and these thresholds are then stored in the preset coefficient list as coefficient values, in ascending or descending order.
  • A coefficient value smaller than the current coefficient threshold may first be selected from the preset coefficient list. If only one coefficient value is selected, it can be used directly as the coding coefficient. If multiple coefficient values are selected, one of them can be chosen as the coding coefficient at random or by another configured rule.
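  • The threshold lookup and coefficient selection described above can be sketched as follows. This is only an illustration: the function names, the table contents, and the coefficient list are hypothetical placeholders and are not the values used in the patent, and picking the first candidate is just one possible "other rule" in place of random choice.

```python
from itertools import groupby


def max_run_of_ones(bits: str) -> int:
    """Length of the longest run of consecutive '1' characters (0 if none)."""
    return max((len(list(g)) for ch, g in groupby(bits) if ch == "1"), default=0)


def pick_coefficient(bits, threshold_table, coefficient_list):
    """Look up the threshold for the longest run of 1s, then return the
    first listed coefficient that is strictly below that threshold."""
    threshold = threshold_table[max_run_of_ones(bits)]
    candidates = [c for c in coefficient_list if c < threshold]
    if not candidates:
        raise ValueError("no coefficient below the threshold")
    return candidates[0]  # assumption: take the first; random choice also fits


# Placeholder values for illustration only; NOT the patent's actual table.
# They only respect the stated trend: more consecutive 1s, smaller threshold.
demo_table = {1: 1.5, 2: 1.15, 3: 1.05}
demo_coefficients = [1.01, 1.1, 1.2]
```

  • With the example string 1010000110010101000100010, the longest run of 1s is 2, so the sketch looks up the (placeholder) threshold for 2 and keeps only coefficients below it.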
  • S104 Determine, according to the length of the interval and the coding coefficient, a first coding interval corresponding to the first preset character, and a second coding interval corresponding to the second preset character.
  • step S105 may specifically include:
  • S1051 Acquire a current character to be encoded, a current coding coefficient, a current first coding interval, a current second coding interval, and a current interval length.
  • The character to be encoded $S_n$, the first coding interval $U'_0(n)$, the second coding interval $U'_1(n)$, and the interval length $R'_n$ change constantly and are updated each time a character is encoded. For the first encoding, the character to be encoded is usually the first character of the character string to be encoded.
  • the coding coefficient is ⁇ 0
  • the first coding interval is $U'_0(0)$
  • the second coding interval is $U'_1(0)$
  • the length of the interval (that is, the total length of $U'_0(0)$ and $U'_1(0)$) is $R'_0$.
  • S1052 Determine a target interval from the current first coding interval and the current second coding interval according to the current to-be-coded character.
  • step S1052 may specifically include:
  • the current second coding interval is determined as the target interval.
  • The coding interval corresponding to the current character to be encoded $S_n$ needs to be found. For example, if $S_n$ is 0, the target interval is $U'_0$; if $S_n$ is 1, the target interval is $U'_1$.
  • S1053 Update the current first coding interval and the current second coding interval according to the current interval length, the current coding coefficient, and the target interval, to encode the current to-be-coded character.
  • step S1053 may specifically include:
  • Len is the total length of the string to be encoded
  • $L_S$ is the number of symbol types in the string to be encoded. For example, a binary string contains only the symbols 0 and 1, so $L_S$ is 2.
  • the dynamic ratio r k can be calculated by the adaptive probability statistical model, that is, the ratio of the first preset character to the second preset character among the historical coded characters and the current to-be-coded characters is calculated.
  • k ⁇ f k-1 for example, for the symbol sequence 1010000110010101000100010, if the current coded character is the third character, the ratio is 1/2.
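  • The ratio in the example above can be reproduced with a short sketch. It assumes the dynamic ratio is the count of first preset characters (0) divided by the count of second preset characters (1) among the characters up to and including the current one; the function name and the handling of the "no 1 seen yet" case are illustrative assumptions.

```python
from fractions import Fraction


def dynamic_ratio(bits: str, k: int) -> Fraction:
    """Ratio of '0' count to '1' count among the first k characters (k is 1-based)."""
    prefix = bits[:k]
    zeros, ones = prefix.count("0"), prefix.count("1")
    if ones == 0:
        # Assumption: the source does not say what happens before any '1' is seen.
        raise ZeroDivisionError("no '1' seen yet; the ratio is undefined")
    return Fraction(zeros, ones)


# Third character of the example sequence: prefix "101" has one 0 and two 1s.
print(dynamic_ratio("1010000110010101000100010", 3))  # 1/2
```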
  • steps 1-3 may specifically include:
  • the current first coding interval is updated by using a smaller subinterval after division, and the current second coding interval is updated by using a larger subinterval after division.
  • $U'_0(n) = \left[\, L'_{n-1},\ L'_{n-1} + \frac{f_k}{f_k + 1} \cdot R'_n - 1 \,\right]$
  • $U'_1(n) = \left[\, L'_{n-1} + \frac{f_k}{f_k + 1} \cdot R'_n,\ H'_n \,\right]$.
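  • The two interval formulas above can be sketched with integer endpoints as follows. The parameter names (`low` for the lower bound, `length` for the interval length, `high` for the upper bound, `f_k` for the ratio) are illustrative, and truncating the split point to an integer is an assumption; the source does not state a rounding rule.

```python
from fractions import Fraction


def split_interval(low: int, length: int, high: int, f_k: Fraction):
    """Return (U0, U1) as inclusive integer intervals.

    U0 = [low, low + f_k/(f_k + 1) * length - 1]
    U1 = [low + f_k/(f_k + 1) * length, high]
    """
    cut = low + int(f_k / (f_k + 1) * length)  # truncation is an assumption
    return (low, cut - 1), (cut, high)


def target_interval(symbol: str, u0, u1):
    """Pick U0 for character '0' and U1 for character '1'."""
    return u0 if symbol == "0" else u1
```

  • For instance, with a ratio of 1/2 the split fraction is (1/2)/(3/2) = 1/3, so an interval of length 900 starting at 0 is divided at 300.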
  • S1054 When the encoding is completed, use the updated first coding interval and the updated second coding interval as the current first coding interval and the current second coding interval, use the next to-be-coded character as the current to-be-coded character, and return to step S1051 until all the characters to be encoded have been encoded.
  • The encoding operation is the same for every character to be encoded; that is, encoding the character string is a continuous loop. For the loop to proceed normally, the coding coefficient and the interval length involved in each iteration must be continuously updated. That is, for each character to be encoded, when the encoding is completed, the data encoding method may further include:
  • the next coding coefficient is taken as the current coding coefficient, and the next interval length is taken as the current interval length.
  • K( ⁇ n ) ⁇ n - ⁇ n-1 , where N-1 is the number of historical coded characters.
  • step S105 may further include:
  • step S106 may specifically include:
  • the encoded value, the second quantity, and the total number are used as the encoding result of the character string to be encoded.
  • For example, if the character string to be encoded is 1010000110010101000100010, the total number Len is 25, the second number Count is 9, and the encoding result is V, Count, Len.
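  • As a small illustration of how the encoding result (V, Count, Len) is assembled, the sketch below counts the characters of the example string. The class and function names are hypothetical, and the encoded value V is passed in as an opaque string because its computation is covered by the preceding steps.

```python
from typing import NamedTuple


class EncodingResult(NamedTuple):
    value: str   # V: the encoded value (digits output by the encoder)
    count: int   # Count: number of second preset characters ('1')
    length: int  # Len: total number of characters


def make_result(encoded_value: str, bits: str) -> EncodingResult:
    """Bundle the encoded value with the two counts needed for decoding."""
    return EncodingResult(encoded_value, bits.count("1"), len(bits))
```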
  • the decoding operation is also involved, that is, the data encoding method may further include:
  • A reference string is generated according to the decoding request, where the reference string includes a first quantity of first preset characters and a second quantity of second preset characters, the first quantity is equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string is the second preset character.
  • For example, the first quantity is 16, the length of the reference string is also 25 characters, the first character is 1, and the tail characters are all 1; that is, the initial sequence of the reference string is 1000000000000000011111111.
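  • The initial reference string in this example can be built from Len and Count alone. The sketch below assumes the layout suggested by the example (one leading 1, then all the 0s, then the remaining 1s); the function name is illustrative.

```python
def initial_reference_string(total: int, ones: int) -> str:
    """Leading '1', then (total - ones) zeros, then the remaining (ones - 1) ones."""
    if not 1 <= ones <= total:
        raise ValueError("need at least one '1' and no more ones than characters")
    zeros = total - ones  # the "first quantity"
    return "1" + "0" * zeros + "1" * (ones - 1)
```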
  • steps 2-3 may specifically include:
  • the decoding result is generated based on all decoded characters.
  • Each character in the reference character string may be encoded according to the above coding method, with the preset interval length also $R_0$ and the initial coding coefficient also ⁇ 0.
  • The only difference is that the reference character string is what is encoded, which finally yields the first coding interval $D'_0$ corresponding to character 0 and the second coding interval $D'_1$ corresponding to character 1.
  • The current reference string changes constantly, and each change is determined by the previous reference string and its reference code value $t_n$.
  • determining the decoded character according to the reference coded value and the encoded value may specifically include:
  • the first preset character is determined as the decoded character.
  • The following detailed description takes as an example the case where the data encoding apparatus is integrated in the electronic device, the probability statistical model is a static probability statistical model, and the coding coefficient is a static value.
  • the electronic device obtains a to-be-encoded character string, and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters.
  • the electronic device determines a coefficient threshold according to a maximum number of consecutive occurrences of the second preset character in the character string to be encoded, and selects a coding coefficient smaller than the coefficient threshold from the preset coefficient list.
  • the maximum number of consecutive occurrences of 1 is 2, and the coefficient threshold determined from the table can be 1.153133...3, so that any coefficient belonging to (0, 1.153133...3) in the preset coefficient list can be used as
  • the electronic device determines, according to the length of the interval and the coding coefficient, a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character.
  • the electronic device acquires a current to-be-coded character, a current coding coefficient, a current first coding interval, a current second coding interval, and a current interval length.
  • S205 If the current to-be-coded character is the first preset character, the electronic device determines the current first coding interval as the target interval; if the current to-be-coded character is the second preset character, the electronic device determines the current second coding interval as the target interval.
  • the electronic device calculates a length of the next interval according to the current interval length and the current coding coefficient, and calculates an upper limit value of the interval according to the length of the next interval and the minimum endpoint value of the target interval.
  • The electronic device divides the interval between the minimum endpoint value and the upper limit value into two sub-intervals, uses the smaller sub-interval after the division as the current first coding interval, and uses the larger sub-interval after the division as the current second coding interval.
  • S208 The electronic device determines whether the to-be-encoded character string has been fully encoded. If not, the next to-be-coded character is used as the current to-be-coded character, the next interval length is used as the current interval length, and the process returns to step S204; if yes, the following step S209 is performed.
  • The electronic device acquires the two endpoint values of the current second coding interval and determines whether their current highest digits are the same. If yes, the following step S210 is performed; if not, the following step S211 is performed.
  • The electronic device outputs the shared digit as the target number, sets the digit adjacent to the highest digit as the current highest digit, and then returns to the above step S209.
  • the output target number includes 7, 3, 0, 4, 2, 9.
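  • The digit-by-digit comparison of the two endpoint values amounts to taking their common leading digits. A minimal sketch follows; the function name is illustrative, the endpoint values in the usage note are made up for demonstration, and returning nothing when the endpoints have different digit counts is an assumption.

```python
def shared_leading_digits(low: int, high: int) -> str:
    """Return the decimal digits shared by both endpoints, from the highest digit down."""
    a, b = str(low), str(high)
    if len(a) != len(b):
        return ""  # assumption: different digit counts share no leading digit
    out = []
    for x, y in zip(a, b):
        if x != y:
            break  # first mismatch ends the output, as in the "judgment is no" branch
        out.append(x)
    return "".join(out)
```

  • For hypothetical endpoints 730429118 and 730429950, the sketch outputs the digits 7, 3, 0, 4, 2, 9 in order and stops at the first differing digit.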
  • The electronic device sorts the target numbers in output order to obtain an encoded value, then counts the total number of characters in the to-be-coded string and the second quantity of second preset characters, and uses the encoded value, the second quantity, and the total number as the encoding result of the character string to be encoded.
  • the total number of statistics Len is 25, and the second number Count is 9.
  • the electronic device acquires a decoding request, where the decoding request carries the encoded result.
  • The electronic device generates a reference character string according to the decoding request, where the reference character string includes a first quantity of first preset characters and a second quantity of second preset characters, the first quantity is equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string is the second preset character.
  • For example, the first number is 16, the length of the reference string is also 25 characters, the first character is 1, and the trailing characters are all 1; that is, the initial sequence of the reference string is 1000000000000000011111111.
  • The electronic device encodes the current reference character string to obtain a reference code value, and appends the first preset character to the end of the encoded value so that the encoded value and the reference code value have an equal number of characters. It then judges whether the padded encoded value is not less than the reference code value; if so, the second preset character is determined as the decoded character, and if not, the first preset character is determined as the decoded character.
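  • The comparison in this decoding step can be sketched as below. The function name is illustrative, the values are treated as decimal digit strings, and the sketch assumes the reference code value is at least as long as the encoded value, so that only the encoded value needs padding with the first preset character (0).

```python
def decide_symbol(encoded_value: str, reference_value: str) -> str:
    """Pad the encoded value with '0' to the reference length, then compare.

    Returns '1' if the padded encoded value is not less than the reference
    code value, otherwise '0'.
    """
    # Assumption: len(reference_value) >= len(encoded_value).
    padded = encoded_value.ljust(len(reference_value), "0")
    return "1" if int(padded) >= int(reference_value) else "0"
```

  • With made-up values, an encoded value 730429 compared against a reference code value 73042811 is padded to 73042900 and yields 1, while against 73043000 it yields 0.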
  • The electronic device updates the arrangement of the characters in the current reference character string according to the decoded character, uses the updated reference character string as the current reference character string, and then returns to the above step S214 until the accumulated number of encodings equals the total number; the decoding result is then generated from all decoded characters.
  • the encoding method of the reference string can refer to the encoding method of the string to be encoded, and will not be described here, the only difference is that the first corresponding to the character 0 is obtained.
  • $V' < t_3$, output symbol 0;
  • the 1011000000000000000111111 is adjusted to obtain the reference string 1010100000000000000111111, and the decoding is continued with 1010100000000000000111111.
  • Table 2 is the result of the above decoding process. The table shows that the decoding result is 1010000110010101000100010; that is, the sequence to be encoded is recovered completely, so the above coding method is a lossless compression method. It is mainly applied to data types, such as Word documents, that require high-fidelity restoration.
  • the data encoding method provided in this embodiment, wherein the electronic device can obtain a character string to be encoded and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters.
  • the operation of the two endpoint values of the interval; if not, the target numbers are sorted in output order to obtain the encoded value. The total number of characters in the string to be encoded and the second number of second preset characters are then counted, and the encoded value, the second quantity, and the total number are used as the encoding result of the character string to be encoded. The encoding operation of the binary string can therefore be better implemented, with strong compression capability.
  • The electronic device can obtain a decoding request that carries the encoding result, and then generate a reference string according to the decoding request, where the reference string includes a first number of first preset characters and a second number of second preset characters, the first quantity is equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string is the second preset character. It then encodes the current reference character string to obtain a reference code value, appends the first preset character to the end of the encoded value so that the encoded value and the reference code value have an equal number of characters, and determines whether the padded encoded value is not less than the reference code value. If so, the second preset character is determined as the decoded character; if not, the first preset character is determined as the decoded character. Then, according to the decoded character, the arrangement of the characters in the current reference character string is updated, and the updated reference character string is used as the
  • the operation of obtaining the reference code value is performed until the accumulated value of the number of times of encoding is equal to the total number. Finally, the decoding result is generated according to all the decoded characters, thereby implementing lossless compression of the binary string, and the method is simple and flexible.
  • the present embodiment will be further described from the perspective of the data encoding apparatus according to the method described in the foregoing embodiments.
  • the data encoding apparatus may be implemented as an independent entity or integrated in an electronic device such as a terminal or a server.
  • the electronic device can include a smartphone, a tablet, a personal computer, and the like.
  • FIG. 3a shows a data encoding apparatus according to an embodiment of the present invention, which may include an obtaining module 10, a first determining module 20, a selecting module 30, a second determining module 40, an encoding module 50, and a generating module 60, of which:
  • the obtaining module 10 is configured to obtain a character string to be encoded, and a preset interval length, where the to-be-coded character string includes a plurality of first preset characters and second preset characters.
  • the to-be-encoded character string includes a binary character string
  • the first preset character may be 0, and the second preset character may be 1.
  • The preset interval length mainly limits the initial size of the coding space. It may be a manually set value such as 100000000000 or larger, determined according to actual needs.
  • the first determining module 20 is configured to determine a coefficient threshold according to a maximum number of consecutive occurrences of the second preset character in the character string to be encoded.
  • the coefficient threshold corresponding to this maximum count can be obtained by table lookup. Its value usually depends only on the number of consecutive 1s in the string to be encoded: the more consecutive 1s, the smaller the threshold.
  • in practice, the relationship between the coefficient value and the number of consecutive 1s in a sample can be derived by computing over a large number of samples, and the coefficient threshold for each number of consecutive 1s is then stored in a table, from which it can be read directly when needed. The computation mainly uses a formula (an image in the original publication), where i, j, n ∈ [1, Len], Len is the total character length of each sample, p(n) > 1, T is the total statistic of all symbols in each sample, f is the statistic of a given symbol in each sample, O is the cumulative statistic of all symbols preceding a given symbol, and α is the coefficient value; the relationship between the coefficient value and the number of consecutive 1s is determined from how p(n) changes.
  • the selecting module 30 is configured to select, from the preset coefficient list, a coding coefficient that is less than the coefficient threshold.
  • the preset coefficient list may be set in advance: the coefficient threshold for each possible number of consecutive 1s is computed beforehand, and these thresholds are stored in the list as coefficient values in ascending or descending order.
  • when a coding coefficient is needed, the coefficient values smaller than the current coefficient threshold are first selected from the list. If exactly one value qualifies, it is used directly as the coding coefficient; if several qualify, one of them is chosen at random or by another preset rule.
  • the second determining module 40 is configured to determine, according to the length of the interval and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character.
  • the encoding module 50 is configured to encode the to-be-encoded character string by using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value.
  • the encoding module 50 may specifically include:
  • the first obtaining sub-module 51 is configured to acquire a current to-be-coded character, a current encoding coefficient, a current first encoding interval, a current second encoding interval, and a current interval length.
  • the character to be encoded S_n, the first coding interval U'_0(n), the second coding interval U'_1(n), and the interval length R'_n change continuously: each is updated once per encoded character. For the first encoding step, the character to be encoded is usually the first character of the string to be encoded;
  • the coding coefficient is then α_0;
  • the first coding interval is U'_0(0);
  • the second coding interval is U'_1(0);
  • the interval length (that is, the total length of U'_0(0) and U'_1(0)) is R'_0.
  • the determining sub-module 52 is configured to determine a target interval from the current first coding interval and the current second coding interval according to the current to-be-coded character.
  • the determining sub-module 52 can be specifically used to:
  • determine whether the current character to be encoded is the first preset character; if so, take the current first coding interval as the target interval; if not, take the current second coding interval as the target interval.
  • during encoding, the coding interval corresponding to the current character to be encoded S_n must be found. For example, if S_n is 0, the target interval determined by the determining sub-module 52 is U'_0; if S_n is 1, it is U'_1.
  • the update sub-module 53 is configured to update the current first coding interval and the current second coding interval according to the current interval length, the current coding coefficient, and the target interval, to encode the current to-be-coded character;
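  • the update sub-module 53 can be specifically used to: compute the next interval length from the current interval length and the current coding coefficient; compute, over the historical encoded characters and the current character to be encoded, the ratio of first preset characters to second preset characters; and update the current first and second coding intervals according to this ratio, the next interval length, and the minimum endpoint of the target interval.
  • in the interval-length formula (an image in the original publication), Len is the total length of the string to be encoded
  • and L_S is the number of symbol types in the string to be encoded. For a binary string, whose symbols are only 0 and 1, L_S is 2.
  • the dynamic ratio f_k can be computed with an adaptive probability statistical model, that is, by continually computing the ratio of first preset characters to second preset characters among the historical encoded characters and the current character to be encoded,
  • so that f_k ≠ f_{k-1}. For example, for the symbol sequence 1010000110010101000100010, if the current character is the third one, the ratio is 1/2.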
  • steps 1-3 may specifically include:
  • the current first coding interval is updated by using a smaller subinterval after division, and the current second coding interval is updated by using a larger subinterval after division.
  • U'_0(n) = [L'_{n-1}, L'_{n-1} + (f_k/(f_k+1))*R'_n - 1]
  • U'_1(n) = [L'_{n-1} + (f_k/(f_k+1))*R'_n, H'_n].
  • the returning module 54 is configured to: when the encoding of the current character is complete, take the updated first and second coding intervals as the current first and second coding intervals, take the next character to be encoded as the current character to be encoded,
  • and return to the operation of acquiring the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length, until all characters to be encoded have been encoded.
  • the encoding procedure is the same for every character, so encoding the string is a loop. For the loop to work properly, the coding coefficient and interval length involved in each iteration
  • should be updated continually; that is, the returning module 54 can also be used to:
  • count the historical encoded characters when the encoding is complete, compute the next coding coefficient from that count and the current coding coefficient, take the next coding coefficient as the current coding coefficient, and take the next interval length as the current interval length.
  • when the coding coefficient is dynamic, it is given by a relation K(α_n) between α_n and α_{n-1}, where n-1 is the number of historical encoded characters.
  • the encoding module 50 may further include:
  • a second obtaining sub-module 55, configured to acquire the two endpoint values of the current second coding interval when all characters to be encoded have been encoded;
  • a judging sub-module 56, configured to determine whether the two endpoint values have the same digit at the current highest position;
  • an output sub-module 57, configured to, if so, output that shared digit as a target digit, take the next position adjacent to the highest position as the current highest position, and return to the step of acquiring the two endpoint values of the current second coding interval, until the judgment result is no;
  • a sorting sub-module 58, configured to arrange the target digits in output order to obtain the encoded value.
  • the generating module 60 is configured to generate an encoding result according to the encoded value, and output the encoded result.
  • the generating module 60 can be specifically configured to:
  • count the total quantity of characters in the string to be encoded and the second quantity of second preset characters, and use the encoded value, the second quantity, and the total quantity as the encoding result of the string to be encoded.
  • for example, if the string to be encoded is 1010000110010101000100010, the total quantity Len is 25, the second quantity Count is 9, and the encoding result is V, Count, Len.
  • the data encoding apparatus may further include a decoding module, configured to:
  • obtain, after the encoding result is output, a decoding request that carries the encoding result;
  • generate a reference string according to the decoding request, where the reference string includes a first quantity of first preset characters and a second quantity of second preset characters, the first quantity equals the total quantity minus the second quantity, and the highest-order character of the reference string is the second preset character;
  • decode the encoded value according to the reference string.
  • for example, for Count = 9 and Len = 25, the first quantity is 16, the reference string is also 25 characters long, its first character is 1, and its trailing characters are all 1, so the initial reference string is 1000000000000000011111111.
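  • the decoding module can further be used to:
  • encode the current reference string to obtain a reference code value; determine a decoded character from the reference code value and the encoded value, and update the arrangement of characters in the current reference string according to the decoded character; take the updated reference string as the current reference string and return to the encoding step until the cumulative number of encodings equals the total quantity; then generate the decoding result from all decoded characters.
  • each character of every reference string may be encoded with the coding method above, with the same preset interval length R_0 and the same initial coding coefficient α_0.
  • once all characters of the reference string are encoded, the first coding interval D'_0 for character 0 and the second coding interval D'_1 for character 1 are obtained. The difference is that the reference code value is not taken from the shared high-order digits: the minimum endpoint of D'_0 is taken directly as T_n, and the reference code value is t_n = T_n*y(n), where y(n) is determined experimentally and depends on α_n.
  • the current reference string changes continually; each change is determined by the previous reference string and its reference code value t_n, that is, the positions of the characters 1 and 0 in the current reference string are rearranged accordingly.
  • the decoding module can be used to:
  • pad the end of the encoded value with the first preset character so that the encoded value and the reference code value have the same number of characters; determine whether the padded encoded value is not less than the reference code value; if so, take the second preset character as the decoded character; if not, take the first preset character as the decoded character.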
  • in a specific implementation, the foregoing units may be implemented as separate entities or combined arbitrarily into one or several entities.
  • for the specific implementation of each unit, refer to the foregoing method embodiments; details are not repeated here.
  • the data encoding apparatus obtains the to-be-encoded character string and the preset interval length by the obtaining module 10, and the to-be-coded character string includes a plurality of first preset characters and second preset characters.
  • the first determining module 20 determines a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded, and then the selecting module 30 selects a coding coefficient smaller than the coefficient threshold from the preset coefficient list.
  • the second determining module 40 determines the first coding interval corresponding to the first preset character and the second coding interval corresponding to the second preset character according to the interval length and the coding coefficient. The encoding module 50 then encodes the string to be encoded using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value, and the generating module 60 generates an encoding result from the encoded value and outputs it. Lossless compression of binary data is thereby achieved with strong compression capability and a good compression effect.
  • an embodiment of the present invention further provides an electronic device, as shown in FIG. 4, which is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the electronic device can include a processor 701 with one or more processing cores, a memory 702 comprising one or more computer-readable storage media, a power source 703, and an input unit 704. Those skilled in the art will understand that the structure illustrated in FIG. 4 does not limit the electronic device, which may include more or fewer components than illustrated, combine certain components, or arrange components differently, where:
  • the processor 701 is the control center of the electronic device. It connects the various parts of the device through various interfaces and lines and, by running or executing the software programs and/or modules stored in the memory 702 and calling the data stored in the memory 702, performs the functions of the device and processes its data, thereby monitoring the device as a whole.
  • the processor 701 may include one or more processing cores; preferably, the processor 701 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 701.
  • the memory 702 can be used to store software programs and modules, and the processor 701 executes various functional applications and data processing by running software programs and modules stored in the memory 702.
  • the memory 702 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, the applications required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created through use of the electronic device, and the like.
  • memory 702 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 702 can also include a memory controller to provide processor 701 access to memory 702.
  • the electronic device also includes a power source 703 that supplies power to the various components.
  • the power source 703 can be logically coupled to the processor 701 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • the power supply 703 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
  • the electronic device can also include an input unit 704 that can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function controls.
  • the electronic device may further include a display unit or the like, which will not be described herein.
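  • specifically, in this embodiment, the processor 701 loads the executable files corresponding to the processes of one or more applications into the memory 702 according to the following instructions, and runs the applications stored in the memory 702 to implement various functions, namely: obtaining a string to be encoded and a preset interval length; determining a coefficient threshold from the maximum number of consecutive occurrences of the second preset character; selecting a coding coefficient smaller than the threshold from the preset coefficient list; determining the first and second coding intervals from the interval length and the coding coefficient; and encoding the string with them to obtain an encoded value.
  • an encoding result is generated based on the encoded value, and the encoding result is output.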
  • the electronic device can achieve the beneficial effects of any of the data encoding apparatuses provided by the embodiments of the present invention. For details, refer to the previous embodiments; they are not repeated here.
  • an embodiment of the present invention provides a storage medium in which a plurality of instructions are stored, which can be loaded by a processor to perform the steps in any of the data encoding methods provided by the embodiments of the present invention.
  • the storage medium may include: a read only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

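The present invention discloses a data encoding method, apparatus, and storage medium. The method includes: obtaining a string to be encoded; determining a coefficient threshold according to the maximum number of consecutive occurrences of a second preset character in the string to be encoded, and selecting a coding coefficient smaller than the coefficient threshold; determining a first coding interval and a second coding interval according to a preset interval length and the coding coefficient; encoding the string to be encoded to obtain an encoded value; and generating and outputting an encoding result according to the encoded value.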

Description

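A data encoding method, apparatus, and storage medium
This application claims priority to Chinese patent application No. 201710765880.5, titled "A data encoding method, apparatus, and storage medium", filed with the Chinese Patent Office on August 30, 2017, the entire contents of which are incorporated herein by reference.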
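Technical Field
The present invention relates to the field of computer technology, and in particular to a data encoding method, apparatus, and storage medium.
Background
Data compression is a technique that, without losing information, reduces the amount of data in order to save storage space and improve the efficiency of transmission, storage, and processing.
Existing data compression techniques comprise lossy compression and lossless compression. In terms of restoring compressed data, lossless compression means that the data reconstructed (also called restored or decompressed) from the compressed data is identical to the original data, whereas lossy compression means that the reconstructed data differs from the original data but does not cause the information expressed by the original material to be misunderstood. There are currently many data compression methods, and data with different characteristics call for different compression methods (that is, encoding methods). For data with relatively low redundancy, such as binary streams, however, existing compression methods reduce the data volume only to a limited extent, and the compression effect is poor.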
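Summary of the Invention
The object of the present invention is to provide a data encoding method, apparatus, and storage medium that solve the technical problem that existing data compression methods reduce the data volume only to a limited extent and compress poorly.
To solve the above technical problem, embodiments of the present invention provide the following technical solution:
A data encoding method, comprising:
obtaining a string to be encoded and a preset interval length, the string to be encoded comprising a plurality of first preset characters and second preset characters;
determining a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string to be encoded;
selecting, from a preset coefficient list, a coding coefficient smaller than the coefficient threshold;
determining, according to the interval length and the coding coefficient, a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character;
encoding the string to be encoded using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value;
generating an encoding result according to the encoded value, and outputting the encoding result.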
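To solve the above technical problem, embodiments of the present invention further provide the following technical solution:
A data encoding apparatus, comprising:
an obtaining module, configured to obtain a string to be encoded and a preset interval length, the string to be encoded comprising a plurality of first preset characters and second preset characters;
a first determining module, configured to determine a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string to be encoded;
a selecting module, configured to select, from a preset coefficient list, a coding coefficient smaller than the coefficient threshold;
a second determining module, configured to determine, according to the interval length and the coding coefficient, a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character;
an encoding module, configured to encode the string to be encoded using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value;
a generating module, configured to generate an encoding result according to the encoded value and output the encoding result.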
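Further, the encoding module specifically comprises:
a first obtaining sub-module, configured to acquire the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length;
a determining sub-module, configured to determine a target interval from the current first coding interval and the current second coding interval according to the current character to be encoded;
an update sub-module, configured to update the current first coding interval and the current second coding interval according to the current interval length, the current coding coefficient, and the target interval, so as to encode the current character to be encoded;
a returning module, configured to, when the encoding is complete, take the updated first coding interval and the updated second coding interval as the current first coding interval and the current second coding interval, take the next character to be encoded as the current character to be encoded, and return to the operation of acquiring the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length, until all characters to be encoded have been encoded.
Further, the update sub-module is specifically configured to:
compute the next interval length according to the current interval length and the current coding coefficient;
obtain the historical encoded characters, and compute the ratio of the first preset character to the second preset character among the historical encoded characters and the current character to be encoded;
update the current first coding interval and the current second coding interval according to the ratio, the next interval length, and the minimum endpoint value of the target interval.
Further, the update sub-module is specifically configured to:
compute an interval upper limit according to the next interval length and the minimum endpoint value;
divide the interval between the minimum endpoint value and the interval upper limit using the ratio, to obtain two subintervals;
update the current first coding interval with the smaller subinterval after division, and update the current second coding interval with the larger subinterval after division.
Further, the returning module is also configured to:
count the historical encoded characters when the encoding is complete;
compute the next coding coefficient according to the number of historical encoded characters and the current coding coefficient;
take the next coding coefficient as the current coding coefficient, and the next interval length as the current interval length.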
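Further, the encoding module also comprises:
a second obtaining sub-module, configured to acquire the two endpoint values of the current second coding interval when all characters to be encoded have been encoded;
a judging sub-module, configured to determine whether the two endpoint values have the same digit at the current highest position;
an output sub-module, configured to, if the judgment result is yes, output that shared digit as a target digit, take the next position adjacent to the highest position as the current highest position, and then return to the step of acquiring the two endpoint values of the current second coding interval, until the judgment result is no;
a sorting sub-module, configured to arrange the target digits in output order to obtain the encoded value.
Further, the generating module is specifically configured to:
count the total quantity of characters in the string to be encoded and the second quantity of second preset characters;
use the encoded value, the second quantity, and the total quantity as the encoding result of the string to be encoded.
Further, the data encoding apparatus also comprises a decoding module, configured to:
obtain a decoding request after the encoding result is output, the decoding request carrying the encoding result;
generate a reference string according to the decoding request, the reference string comprising a first quantity of first preset characters and a second quantity of second preset characters, the first quantity being equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string being the second preset character;
decode the encoded value according to the reference string.
Further, the decoding module is specifically configured to:
encode the current reference string to obtain a reference code value;
determine a decoded character according to the reference code value and the encoded value, and update the arrangement of the characters in the current reference string according to the decoded character;
take the updated reference string as the current reference string, and return to the step of encoding the current reference string, until the cumulative number of encodings equals the total quantity;
generate a decoding result from all decoded characters.
Further, the decoding module is specifically configured to:
pad the end of the encoded value with the first preset character so that the encoded value and the reference code value have the same number of characters;
determine whether the padded encoded value is not less than the reference code value;
if so, take the second preset character as the decoded character;
if not, take the first preset character as the decoded character.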
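To solve the above technical problem, embodiments of the present invention further provide the following technical solution:
A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps of any of the data encoding methods described above.
With the data encoding method, apparatus, and storage medium provided by the present invention, a string to be encoded and a preset interval length are obtained, the string comprising a plurality of first preset characters and second preset characters; a coefficient threshold is determined according to the maximum number of consecutive occurrences of the second preset character in the string; a coding coefficient smaller than the coefficient threshold is then selected from a preset coefficient list; a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character are determined according to the interval length and the coding coefficient; the string is then encoded using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value; and an encoding result is generated from the encoded value and output. Lossless compression of binary data is thereby achieved with strong compression capability and a good compression effect.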
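Brief Description of the Drawings
The technical solutions and other beneficial effects of the present invention will become apparent from the following detailed description of specific embodiments, taken in conjunction with the accompanying drawings.
FIG. 1a is a schematic flowchart of the data encoding method provided by an embodiment of the present invention;
FIG. 1b is a schematic flowchart of step S105 provided by an embodiment of the present invention;
FIG. 1c is another schematic flowchart of step S105 provided by an embodiment of the present invention;
FIG. 2 is another schematic flowchart of the data encoding method provided by an embodiment of the present invention;
FIG. 3a is a schematic structural diagram of the data encoding apparatus provided by an embodiment of the present invention;
FIG. 3b is a schematic structural diagram of the encoding module provided by an embodiment of the present invention;
FIG. 3c is another schematic structural diagram of the encoding module provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the electronic device provided by an embodiment of the present invention.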
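Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiments of the present invention provide a data encoding method, apparatus, storage medium, and electronic device, each of which is described in detail below.
A data encoding method, comprising: obtaining a string to be encoded and a preset interval length, the string comprising a plurality of first preset characters and second preset characters; determining a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string; selecting, from a preset coefficient list, a coding coefficient smaller than the coefficient threshold; determining, according to the interval length and the coding coefficient, a first coding interval corresponding to the first preset character and a second coding interval corresponding to the second preset character; encoding the string using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value; and generating an encoding result according to the encoded value and outputting it.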
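As shown in FIG. 1a, the specific flow of the data encoding method may be as follows:
S101. Obtain a string to be encoded and a preset interval length, the string to be encoded comprising a plurality of first preset characters and second preset characters.
In this embodiment, the string to be encoded includes a binary string; the first preset character may be 0 and the second preset character may be 1. The preset interval length mainly limits the size of the initial coding space; it may be a manually set value such as 100000000000, or larger, depending on actual requirements.
S102. Determine a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string to be encoded.
In this embodiment, the coefficient threshold corresponding to this maximum count can be obtained by table lookup. Its value usually depends only on the number of consecutive 1s in the string to be encoded: the more consecutive 1s, the smaller the threshold. In practice, the relationship between the coefficient value and the number of consecutive 1s in a sample can be derived by computing over a large number of samples, after which the coefficient thresholds for the different numbers of consecutive 1s are stored in a table; when needed, the corresponding value is read from the table according to the number of consecutive 1s. The computation mainly uses the formula
Figure PCTCN2018088746-appb-000001
where i, j, n ∈ [1, Len], Len is the total character length of each sample, p(n) > 1, T is the total statistic of all symbols in each sample, f is the statistic of a given symbol in each sample, O is the cumulative statistic of all symbols preceding a given symbol, and α is the coefficient value; the relationship between the coefficient value and the number of consecutive 1s in the sample is determined from how p(n) changes.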
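As a minimal illustrative sketch (the function name `max_run_of_ones` is hypothetical and not part of the original disclosure), the maximum number of consecutive occurrences of the character 1 used in step S102 could be computed as follows. The lookup table that maps this count to a coefficient threshold is not given in the source, so it is left out.

```python
from itertools import groupby

def max_run_of_ones(bits: str) -> int:
    """Length of the longest run of consecutive '1' characters (0 if there is none)."""
    # groupby yields one group per maximal run of identical characters.
    return max((len(list(group)) for ch, group in groupby(bits) if ch == "1"), default=0)

# The example string used throughout the description.
longest = max_run_of_ones("1010000110010101000100010")
print(longest)
```

For the example string the longest run is 2, which agrees with the value the description uses in step S202.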
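S103. Select, from the preset coefficient list, a coding coefficient smaller than the coefficient threshold.
In this embodiment, the preset coefficient list may be set in advance: the coefficient threshold for each possible number of consecutive 1s is computed beforehand, and these thresholds are stored in the list as coefficient values in ascending or descending order. When a coding coefficient is needed, the coefficient values smaller than the current coefficient threshold are first selected from the list. If exactly one value qualifies, it is used directly as the coding coefficient; if several qualify, one of them is chosen at random or by another preset rule.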
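The selection in step S103 can be sketched as follows. The list contents, the threshold 1.15, and the tie-breaking rule (taking the largest eligible value) are assumptions chosen for illustration only; the source merely requires a coefficient below the threshold and allows a random or other preset choice.

```python
def eligible_coefficients(coefficients, threshold):
    """All list entries strictly below the coefficient threshold, in ascending order."""
    return sorted(c for c in coefficients if c < threshold)

def pick_coefficient(coefficients, threshold):
    """Illustrative rule only: take the largest eligible value."""
    candidates = eligible_coefficients(coefficients, threshold)
    if not candidates:
        raise ValueError("no coefficient below the threshold")
    return candidates[-1]

# Hypothetical list and threshold, not taken from the source.
chosen = pick_coefficient([1.2, 1.0, 1.1], 1.15)
print(chosen)
```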
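S104. Determine, according to the interval length and the coding coefficient, the first coding interval corresponding to the first preset character and the second coding interval corresponding to the second preset character.
In this embodiment, the interval length can first be initialized by the formula R'_0 = R_0*α_0, where R'_0 is the initialized interval length, R_0 is the preset interval length, and α_0 is the coding coefficient. For example, if α_0 is 1.1 and R_0 is 100000000000, the initialized interval length R'_0 is 110000000000 and the initialization interval can be [0, 110000000000]. This interval is then divided into a smaller first coding interval and second coding interval. For binary strings the division is usually an equal split, that is, the first coding interval U'_0 = [0, 54999999999] and the second coding interval U'_1 = [55000000000, 110000000000], where U'_0 is the interval for character 0 and U'_1 is the interval for character 1, each of length R'_0/2.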
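The initialization in step S104 can be reproduced with exact rational arithmetic. This is a sketch under the assumption that the intervals are represented as (low, high) pairs; the function name `init_intervals` is hypothetical.

```python
from fractions import Fraction

def init_intervals(r0: int, alpha0: Fraction):
    """Return (R'_0, U'_0, U'_1) for an equal split of [0, R'_0]."""
    r_init = r0 * alpha0              # R'_0 = R_0 * alpha_0
    half = r_init / 2                 # each sub-interval has length R'_0 / 2
    u0 = (Fraction(0), half - 1)      # interval for character 0
    u1 = (half, r_init)               # interval for character 1
    return r_init, u0, u1

r_init, u0, u1 = init_intervals(100_000_000_000, Fraction(11, 10))
print(int(r_init), [int(x) for x in u0], [int(x) for x in u1])
```

With R_0 = 100000000000 and α_0 = 1.1 this gives R'_0 = 110000000000, U'_0 = [0, 54999999999], and U'_1 = [55000000000, 110000000000], the values stated in the description. Using `Fraction(11, 10)` instead of the float 1.1 avoids rounding error in the endpoints.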
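S105. Encode the string to be encoded using the interval length, the coding coefficient, the first coding interval, and the second coding interval to obtain an encoded value.
For example, referring to FIG. 1b, step S105 may specifically include:
S1051. Acquire the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length.
In this embodiment, the coding coefficient α_n may change continually or may be a constant. When it is a constant, it can be α_0; when it is not, it can be obtained through a relation K(α_n) between α_n and α_{n-1}, where K(α_n) can be a specified qualitative function such as addition, multiplication, or logic (important or not important). The character to be encoded S_n, the first coding interval U'_0(n), the second coding interval U'_1(n), and the interval length R'_n change continually: each is updated once per encoded character. For the first encoding step, the character to be encoded is usually the first character of the string to be encoded; the coding coefficient is then α_0, the first coding interval is U'_0(0), the second coding interval is U'_1(0), and the interval length (that is, the total length of U'_0(0) and U'_1(0)) is R'_0.
S1052. Determine a target interval from the current first coding interval and the current second coding interval according to the current character to be encoded.
For example, step S1052 may specifically include:
determining whether the current character to be encoded is the first preset character;
if so, taking the current first coding interval as the target interval;
if not, taking the current second coding interval as the target interval.
In this embodiment, the coding interval corresponding to the current character to be encoded S_n must be found during encoding. For example, if S_n is 0, the target interval is U'_0; if S_n is 1, the target interval is U'_1.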
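S1053. Update the current first coding interval and the current second coding interval according to the current interval length, the current coding coefficient, and the target interval, so as to encode the current character to be encoded.
For example, step S1053 may specifically include:
1-1. Compute the next interval length according to the current interval length and the current coding coefficient.
In this embodiment,
Figure PCTCN2018088746-appb-000002
Len is the total length of the string to be encoded, and L_S is the number of symbol types in the string to be encoded. For example, for a binary string, whose symbols are only 0 and 1, L_S is 2, in which case
Figure PCTCN2018088746-appb-000003
Figure PCTCN2018088746-appb-000004
and so on.
1-2. Obtain the historical encoded characters, and compute the ratio of the first preset character to the second preset character among the historical encoded characters and the current character to be encoded.
In this embodiment, a dynamic ratio f_k can be computed with an adaptive probability statistical model, that is, by continually computing the ratio of first preset characters to second preset characters among the historical encoded characters and the current character to be encoded, so that f_k ≠ f_{k-1}. For example, for the symbol sequence 1010000110010101000100010, if the current character is the third one, the ratio is 1/2.
Alternatively, a static ratio f_k can be computed with a static statistical model, that is, the ratio is simply defined as a fixed value (for example f_k = 1), so that f_k = f_{k-1}.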
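A minimal sketch of the adaptive ratio in step 1-2, assuming the ratio is the count of 0s divided by the count of 1s over the characters seen so far (history plus the current one). The function name is hypothetical, and the behaviour before the first 1 appears is not specified in the source, so the sketch raises an error in that case.

```python
from fractions import Fraction

def zero_one_ratio(bits: str, k: int) -> Fraction:
    """Ratio (#0 / #1) over the first k characters of bits."""
    seen = bits[:k]
    ones = seen.count("1")
    if ones == 0:
        # Not covered by the source: the ratio is undefined without any '1'.
        raise ValueError("ratio undefined before the first '1'")
    return Fraction(seen.count("0"), ones)

ratio = zero_one_ratio("1010000110010101000100010", 3)
print(ratio)
```

For the third character of the example sequence the seen prefix is 101, which contains one 0 and two 1s, so the ratio is 1/2 as stated above.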
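1-3. Update the current first coding interval and the current second coding interval according to the ratio, the next interval length, and the minimum endpoint value of the target interval.
For example, step 1-3 may specifically include:
computing an interval upper limit according to the next interval length and the minimum endpoint value;
dividing the interval between the minimum endpoint value and the interval upper limit using the ratio, to obtain two subintervals;
updating the current first coding interval with the smaller subinterval after division, and updating the current second coding interval with the larger subinterval after division.
In this embodiment, the interval upper limit is H'_n = L'_{n-1} + R'_n, where L'_{n-1} is the minimum endpoint of the target interval. In the adaptive probability statistical model, U'_0(n) = [L'_{n-1}, L'_{n-1} + (f_k/(f_k+1))*R'_n - 1] and U'_1(n) = [L'_{n-1} + (f_k/(f_k+1))*R'_n, H'_n]. In the static probability statistical model, U'_0(n) = [L'_{n-1}, L'_{n-1} + R'_n/2 - 1] and U'_1(n) = [L'_{n-1} + R'_n/2, H'_n].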
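The update in steps 1-1 to 1-3 can be sketched as a single function. The interval-length formula is an image in the source, so the form R'_n = α*R'_{n-1}/L_S used here is an assumption inferred from the numbers of the worked example (for instance 1.1 * 110000000000 / 2 = 60500000000); the function and parameter names are hypothetical.

```python
from fractions import Fraction

def encode_step(symbol, u0, u1, r_cur, alpha, f_k=Fraction(1), n_symbols=2):
    """One interval update; u0 and u1 are (low, high) pairs for characters 0 and 1."""
    low = (u0 if symbol == "0" else u1)[0]     # L'_{n-1}: minimum endpoint of the target
    r_next = alpha * r_cur / n_symbols         # assumed form of R'_n (see lead-in)
    high = low + r_next                        # H'_n = L'_{n-1} + R'_n
    cut = low + f_k / (f_k + 1) * r_next       # split point; f_k = 1 gives an equal split
    return (low, cut - 1), (cut, high), r_next

alpha = Fraction(11, 10)
r = Fraction(110_000_000_000)                  # R'_0
u0, u1 = (0, 54_999_999_999), (55_000_000_000, 110_000_000_000)
for s in "1010":                               # first four characters of the example
    u0, u1, r = encode_step(s, u0, u1, r, alpha)
print([int(x) for x in u0], [int(x) for x in u1])
```

After the first four characters of the example string, the sketch yields U'_0 = [71637500000, 76670343749] and U'_1 = [76670343750, 81703187500], the intervals the worked example gives as U'_0(4) and U'_1(4). Later steps in the source involve rounding of non-integer endpoints, which this exact-arithmetic sketch does not attempt to reproduce.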
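S1054. When the encoding is complete, take the updated first coding interval and the updated second coding interval as the current first coding interval and the current second coding interval, take the next character to be encoded as the current character to be encoded, and return to step S1051, until all characters to be encoded have been encoded.
In this embodiment, when n = Len the whole string to be encoded has been encoded, and U'_1(n) and U'_0(n) are obtained.
The encoding procedure is the same for every character, so encoding the string is a loop. For the loop to work properly, the coding coefficient and interval length involved in each iteration should be updated continually; that is, for each character to be encoded, when the encoding is complete, the data encoding method may further include:
counting the historical encoded characters;
computing the next coding coefficient according to the number of historical encoded characters and the current coding coefficient;
taking the next coding coefficient as the current coding coefficient, and the next interval length as the current interval length.
In this embodiment, the coding coefficient α_n may be a static value, for example α_n = α_0, or a dynamic value. When it is dynamic, it is given by a relation K(α_n) between α_n and α_{n-1}, where n-1 is the number of historical encoded characters. When the current encoding operation is complete, the coding coefficient for the next encoding operation is α_n and the interval length is R'_n.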
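In addition, when all characters to be encoded have been encoded, what is obtained is U'_1(n) and U'_0(n), and the encoded value of the string can be any value within U'_1(n) and U'_0(n). Therefore, referring to FIG. 1c, step S105 may further include:
S1055. When all characters to be encoded have been encoded, acquire the two endpoint values of the current second coding interval;
S1056. Determine whether the two endpoint values have the same digit at the current highest position;
S1057. If so, output that shared digit as a target digit, take the next position adjacent to the highest position as the current highest position, and return to S1055, until the judgment result is no;
S1058. Arrange the target digits in output order to obtain the encoded value.
In this embodiment, if U'_1(n) = [73042919870, 73042952160], the encoded value V is 730429.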
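Steps S1055 to S1058 amount to taking the leading decimal digits that the two endpoints have in common. A minimal sketch follows; the function name is hypothetical, and returning an empty string when the endpoints have different digit counts is an assumption, since the source does not discuss that case.

```python
def common_leading_digits(low: int, high: int) -> str:
    """Leading decimal digits shared by both endpoints, in output order."""
    a, b = str(low), str(high)
    if len(a) != len(b):
        return ""  # assumption: endpoints of different length share no prefix
    digits = []
    for x, y in zip(a, b):
        if x != y:           # judgment result is "no": stop outputting digits
            break
        digits.append(x)     # shared digit becomes a target digit
    return "".join(digits)

v = common_leading_digits(73042919870, 73042952160)
print(v)
```

For the interval [73042919870, 73042952160] the shared prefix is 730429, matching the encoded value V given above.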
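S106. Generate an encoding result according to the encoded value, and output the encoding result.
For example, step S106 may specifically include:
counting the total quantity of characters in the string to be encoded and the second quantity of second preset characters;
using the encoded value, the second quantity, and the total quantity as the encoding result of the string to be encoded.
In this embodiment, if the string to be encoded is 1010000110010101000100010, the total quantity Len is 25, the second quantity Count is 9, and the encoding result is V, Count, Len.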
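As a small sketch of step S106, assuming the encoding result is represented as a tuple (V, Count, Len) and that the second preset character is 1 (the function name is hypothetical):

```python
def encoding_result(bits: str, encoded_value: str):
    """Bundle the encoded value with the count of '1' characters and the total length."""
    count = bits.count("1")      # second quantity: number of second preset characters
    length = len(bits)           # total quantity of characters
    return encoded_value, count, length

result = encoding_result("1010000110010101000100010", "730429")
print(result)
```

For the example string this gives Count = 9 and Len = 25, as stated above.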
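In addition, a decoding operation follows the output of the encoding result; that is, the data encoding method may further include:
2-1. Obtain a decoding request carrying the encoding result;
2-2. Generate a reference string according to the decoding request, the reference string comprising a first quantity of first preset characters and a second quantity of second preset characters, the first quantity being equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string being the second preset character.
In this embodiment, for Count = 9 and Len = 25, the first quantity is 16, the reference string is also 25 characters long, its first character is 1, and its trailing characters are all 1, so the initial reference string is 1000000000000000011111111.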
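The initial reference string of step 2-2 can be built directly from Count and Len. This sketch assumes Count is at least 1; the function name is hypothetical.

```python
def initial_reference(count: int, length: int) -> str:
    """One leading '1', then all the '0's, then the remaining '1's at the tail."""
    zeros = length - count                 # first quantity = total - second quantity
    return "1" + "0" * zeros + "1" * (count - 1)

ref = initial_reference(9, 25)
print(ref)
```

For Count = 9 and Len = 25 this yields 1000000000000000011111111, the initial sequence given above.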
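2-3. Decode the encoded value according to the reference string.
For example, step 2-3 may specifically include:
encoding the current reference string to obtain a reference code value;
determining a decoded character according to the reference code value and the encoded value, and updating the arrangement of the characters in the current reference string according to the decoded character;
taking the updated reference string as the current reference string, and returning to the step of encoding the current reference string, until the cumulative number of encodings equals the total quantity;
generating a decoding result from all decoded characters.
In this embodiment, every character of each reference string can be encoded with the coding method above, with the same preset interval length R_0 and the same initial coding coefficient α_0. When all characters of the reference string have been encoded, the first coding interval D'_0 for character 0 and the second coding interval D'_1 for character 1 are obtained. The difference is that the code value of the reference string (the reference code value) is not the shared high-order digits: the minimum endpoint of D'_0 can be taken directly as T_n, and the reference code value t_n is then computed with the specified function t_n = T_n*y(n), where y(n) is determined experimentally and depends on α_n. The decoded character, which is either 1 or 0, is then determined from t_n and the encoded value V. When the number of encodings reaches Len, all decoded characters are arranged in the order in which they were determined, and the resulting character sequence is the decoding result.
Note that the current reference string changes continually. Each change is determined by the previous reference string and its reference code value t_n; that is, the positions of the characters 1 and 0 in the current reference string are rearranged according to them. The adjustment mainly moves the character 1. For example, if the previous reference string is 1100000000000000001111111: when the decoded character is 1, the highest-order 1 of the tail is moved to just after the lowest-order 1 at the front, giving 1110000000000000000111111 (the updated reference string); when the decoded character is 0, the lowest-order 1 at the front is moved back by one position, giving 1010000000000000001111111 (the updated reference string).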
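One way to read the two moves described above is that the reference string always consists of the characters decoded so far, a trial 1 at the position being decided, and the remaining 1s parked at the tail. The sketch below rebuilds the reference string from that reading; it is an interpretation that reproduces the examples in the description, not a statement of the patented procedure, and the function name is hypothetical. The case where no 1 is left to try is not described in the source, so the sketch raises an error there.

```python
def reference_for(decoded: str, count: int, length: int) -> str:
    """Reference string used to decide the character at position len(decoded)."""
    tail_ones = count - decoded.count("1") - 1     # 1s still parked at the tail
    if tail_ones < 0:
        raise ValueError("no 1 left to try; this case is not described in the source")
    zeros = length - len(decoded) - 1 - tail_ones
    return decoded + "1" + "0" * zeros + "1" * tail_ones

# Previous reference 1100000000000000001111111 corresponds to decoded prefix "1".
print(reference_for("11", 9, 25))   # after decoding a 1
print(reference_for("10", 9, 25))   # after decoding a 0
```

With Count = 9 and Len = 25, the prefixes "11" and "10" give 1110000000000000000111111 and 1010000000000000001111111, the two updated reference strings in the example above.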
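Further, "determining a decoded character according to the reference code value and the encoded value" may specifically include:
padding the end of the encoded value with the first preset character so that the encoded value and the reference code value have the same number of characters;
determining whether the padded encoded value is not less than the reference code value;
if so, taking the second preset character as the decoded character;
if not, taking the first preset character as the decoded character.
In this embodiment, with V = 730429, for the n-th encoding operation: if the reference code value t_n = 85252570554, V padded with 0s becomes V' = 73042900000; then t_n > V' and the decoded character is 0. If the reference code value t_n = 55004691494, then t_n ≤ V' and the decoded character is 1.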
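The comparison rule can be sketched as follows, assuming the encoded value is held as a decimal string and the first preset character is 0 (the function name is hypothetical):

```python
def decode_char(encoded_value: str, t_n: int) -> str:
    """Pad V with trailing 0s to the length of t_n, then compare."""
    padded = int(encoded_value.ljust(len(str(t_n)), "0"))   # V' in the description
    return "1" if padded >= t_n else "0"

print(decode_char("730429", 85252570554), decode_char("730429", 55004691494))
```

With V = 730429 the padded value is 73042900000, so t_n = 85252570554 decodes to 0 and t_n = 55004691494 decodes to 1, as in the example above.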
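The method described in the above embodiments is illustrated in further detail below by way of example.
In this embodiment, the detailed description assumes that the data encoding apparatus is integrated in an electronic device, that the probability statistical model is the static probability statistical model, and that the coding coefficient is a static value.
As shown in FIG. 2, the specific flow of a data encoding method may be as follows:
S201. The electronic device obtains a string to be encoded and a preset interval length, the string comprising a plurality of first preset characters and second preset characters.
For example, the string to be encoded may be 1010000110010101000100010, the preset interval length R_0 = 100000000000, the first preset character 0, and the second preset character 1.
S202. The electronic device determines a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string to be encoded, and selects a coding coefficient smaller than the coefficient threshold from the preset coefficient list.
For example, for 1010000110010101000100010 the maximum number of consecutive 1s is 2, and the coefficient threshold determined from the table may be 1.153133…3, so any coefficient in the preset coefficient list that lies in (0, 1.153133…3) can be used as the coding coefficient α_0, for example α_0 = 1.1.
S203. The electronic device determines, according to the interval length and the coding coefficient, the first coding interval corresponding to the first preset character and the second coding interval corresponding to the second preset character.
For example, the interval length can first be initialized by the formula R'_0 = R_0*α_0, giving the initialized interval length R'_0 = 110000000000; an equal split then gives the first coding interval U'_0(0) = [0, 54999999999] and the second coding interval U'_1(0) = [55000000000, 110000000000].
S204. The electronic device acquires the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length.
S205. If the current character to be encoded is the first preset character, the electronic device takes the current first coding interval as the target interval; if it is the second preset character, the electronic device takes the current second coding interval as the target interval.
S206. The electronic device computes the next interval length according to the current interval length and the current coding coefficient, and computes the interval upper limit according to the next interval length and the minimum endpoint value of the target interval.
S207. The electronic device splits the interval between the minimum endpoint value and the interval upper limit into two equal subintervals, taking the smaller subinterval after division as the current first coding interval and the larger subinterval as the current second coding interval.
S208. The electronic device determines whether the string to be encoded has been fully encoded. If not, it takes the next character to be encoded as the current character to be encoded and the next interval length as the current interval length, and returns to step S204; if so, it performs step S209 below.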
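For example, with α_n = α_0 = 1.1, L_S = 2, f_k = 1, k ∈ [1, L_S], and the string to be encoded 1010000110010101000100010, the whole encoding process is as follows:
Take the 1st character to be encoded, 1: R'_1/2 = 30250000000; using and adjusting interval U'_1(0) gives U'_0(1) = [55000000000, 85249999999] and U'_1(1) = [85250000000, 115500000000].
Take the 2nd character to be encoded, 0: R'_2/2 = 16637500000; using and adjusting interval U'_0(1) gives U'_0(2) = [55000000000, 71637499999] and U'_1(2) = [71637500000, 88275000000].
Take the 3rd character to be encoded, 1: R'_3/2 = 9150625000; using and adjusting interval U'_1(2) gives U'_0(3) = [71637500000, 80788124999] and U'_1(3) = [80788125000, 89938750000].
Take the 4th character to be encoded, 0: R'_4/2 = 5032843750; using and adjusting interval U'_0(3) gives U'_0(4) = [71637500000, 76670343749] and U'_1(4) = [76670343750, 81703187500].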
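Continuing in the same way gives Table 1 below, where L and H are the lower and upper endpoints of the interval for each symbol:
Symbol R'_n/2 L(symbol-0) H(symbol-0) L(symbol-1) H(symbol-1)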
\ 100000000000 \ \ \ \
1 55000000000 0 54999999999 55000000000 110000000000
0 30250000000 55000000000 85249999999 85250000000 115500000000
1 16637500000 55000000000 71637499999 71637500000 88275000000
0 9150625000 71637500000 80788124999 80788125000 89938750000
0 5032843750 71637500000 76670343749 76670343750 81703187500
0 2768064063 71637500000 74405564062 74405564063 77173628125
0 1522435234 71637500000 73159935233 73159935234 74682370469
1 837339379 71637500000 72474839378 72474839379 73312178758
1 460536658 72474839379 72935376036 72935376037 73395912696
0 253295162 72935376037 73188671198 73188671199 73441966362
0 139312339 72935376037 73074688375 73074688376 73214000716
1 76621787 72935376037 73011997823 73011997824 73088619610
0 42141983 73011997824 73054139805 73054139806 73096281789
1 23178090 73011997824 73035175913 73035175914 73058354005
0 12747950 73035175914 73047923863 73047923864 73060671814
1 7011372 73035175914 73042187286 73042187287 73049198659
0 3856255 73042187287 73046043540 73046043541 73049899796
0 2120940 73042187287 73044308226 73044308227 73046429167
0 1166517 73042187287 73043353803 73043353804 73044520321
1 641584 73042187287 73042828870 73042828871 73043470455
0 352871 73042828871 73043181741 73043181742 73043534614
0 194079 73042828871 73043022949 73043022950 73043217030
0 106744 73042828871 73042935614 73042935615 73043042358
1 58709 73042828871 73042887579 73042887580 73042946289
0 32290 73042887580 73042919869 73042919870 73042952160
Table 1
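S209. The electronic device acquires the two endpoint values of the current second coding interval and determines whether they have the same digit at the current highest position. If so, it performs step S210 below; if not, it performs step S211 below.
S210. The electronic device outputs that shared digit as a target digit, takes the next position adjacent to the highest position as the current highest position, and returns to step S209.
For example, for U'_1(25) = [73042919870, 73042952160], the output target digits are 7, 3, 0, 4, 2, 9.
S211. The electronic device arranges the target digits in output order to obtain the encoded value, then counts the total quantity of characters in the string to be encoded and the second quantity of second preset characters, and uses the encoded value, the second quantity, and the total quantity as the encoding result of the string to be encoded.
For example, the counted total quantity Len is 25 and the second quantity Count is 9; after the target digits are arranged in output order, the encoded value V is 730429 and the encoding result is V, Count, Len. Note that, compared with the conventional encoding result 63118085, V = 730429 has only 6 digits, 2 fewer, which improves the compression ratio by 25%; the compression capability is clearly improved and the compression effect is good.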
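S212. The electronic device obtains a decoding request carrying the encoding result.
S213. The electronic device generates a reference string according to the decoding request, the reference string comprising a first quantity of first preset characters and a second quantity of second preset characters, the first quantity being equal to the difference between the total quantity and the second quantity, and the highest-order character of the reference string being the second preset character.
For example, for Count = 9 and Len = 25, the first quantity is 16, the reference string is also 25 characters long, its first character is 1, and its trailing characters are all 1, so the initial reference string is 1000000000000000011111111.
S214. The electronic device encodes the current reference string to obtain a reference code value, and pads the end of the encoded value with the first preset character so that the encoded value and the reference code value have the same number of characters. It then determines whether the padded encoded value is not less than the reference code value; if so, the second preset character is the decoded character, and if not, the first preset character is.
S215. The electronic device updates the arrangement of the characters in the current reference string according to the decoded character, takes the updated reference string as the current reference string, and returns to step S214, until the cumulative number of encodings equals the total quantity; it then generates the decoding result from all decoded characters.
For example, since experiments show that y(n) ≈ 1 when α_0 = 1.1, y(n) = 1 can be used when encoding the reference string, that is, T_n*y(n) = t_n or V/y(n) = v. The reference string is encoded in the same way as the string to be encoded, which is not repeated here. The only difference is that, once the first coding interval D'_0 for character 0 and the second coding interval D'_1 for character 1 are finally obtained, the reference code value is not the shared high-order digits; instead the minimum endpoint of D'_0 is taken directly as T_n, and the reference code value t_n follows from T_n*y(n) = t_n. The whole decoding process is as follows: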
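Obtain Count = 9, Len = 25, and V = 730429, giving the reference string 1000000000000000011111111.
From 1000000000000000011111111 the reference code value t_0 = 55004691494 is obtained. Padding the end of the encoded value V with 0s gives V' = 73042900000; since V' > t_0, the decoded character 1 is output.
Count = Count - 1 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1. The string 1000000000000000011111111 is then adjusted to give the reference string 1100000000000000001111111, from which the reference code value t_1 = 85252570554 is obtained; since V' < t_1, 0 is output.
Count = Count - 0 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1. The string 1100000000000000001111111 is then adjusted to give the reference string 1010000000000000001111111, from which t_2 = 71640070554 is obtained; since V' > t_2, the symbol 1 is output.
Count = Count - 1 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1. The string 1010000000000000001111111 is then adjusted to give the reference string 1011000000000000000111111, from which t_3 = 80789529037 is obtained; since V' < t_3, the symbol 0 is output.
Continuing in the same way, 1011000000000000000111111 is adjusted to give the reference string 1010100000000000000111111, which is used to continue decoding. Decoding ends when Len = 0. Table 2 below lists the results of this decoding process. It shows that the decoding result is 1010000110010101000100010, that is, the sequence to be encoded is recovered in full, so the above encoding method is a lossless compression method. This kind of compression is mainly applied to data types that require high restoration fidelity, such as Word documents.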
t_n V' Decoded-character
55004691494 73042900000 1
85252570554 73042900000 0
71640070554 73042900000 1
80789529037 73042900000 0
76671747787 73042900000 0
74406968100 73042900000 0
73161339271 73042900000 0
72476243416 73042900000 1
72936138490 73042900000 1
73189080781 73042900000 0
73075097958 73042900000 0
73012407405 73042900000 1
73054355308 73042900000 0
73035391416 73042900000 1
73048032622 73042900000 0
73042296045 73042900000 1
73046093591 73042900000 0
73044358276 73042900000 0
73043403853 73042900000 0
73042878920 73042900000 1
73043181742 73042900000 0
73043022950 73042900000 0
73042935615 73042900000 0
73042887580 73042900000 1
73042937629 73042900000 0
Table 2
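As a consistency check on Table 2, the sketch below replays only the comparison step, taking the t_n column as given data (it does not recompute the reference code values). The variable names are hypothetical.

```python
# Reference code values t_n copied from Table 2, in decoding order.
T_VALUES = [
    55004691494, 85252570554, 71640070554, 80789529037, 76671747787,
    74406968100, 73161339271, 72476243416, 72936138490, 73189080781,
    73075097958, 73012407405, 73054355308, 73035391416, 73048032622,
    73042296045, 73046093591, 73044358276, 73043403853, 73042878920,
    73043181742, 73043022950, 73042935615, 73042887580, 73042937629,
]
V_PADDED = 73042900000   # V = 730429 padded with 0s to 11 digits

# A character decodes to 1 when the padded value is not less than t_n, else 0.
decoded = "".join("1" if V_PADDED >= t else "0" for t in T_VALUES)
print(decoded)
```

Applying the rule "1 if V' is not less than t_n, otherwise 0" to every row recovers 1010000110010101000100010, the original string to be encoded.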
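As can be seen from the above, in the data encoding method provided by this embodiment, the electronic device obtains a string to be encoded and a preset interval length, the string comprising a plurality of first preset characters and second preset characters. It determines a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the string, selects a coding coefficient smaller than that threshold from the preset coefficient list, and determines, according to the interval length and the coding coefficient, the first coding interval corresponding to the first preset character and the second coding interval corresponding to the second preset character. It then acquires the current character to be encoded, the current coding coefficient, the current first coding interval, the current second coding interval, and the current interval length, and determines whether the current character is the first preset character: if so, the current first coding interval is the target interval; if not, the current second coding interval is. Next, it computes the next interval length from the current interval length and the current coding coefficient, computes the interval upper limit from the next interval length and the minimum endpoint value of the target interval, and splits the interval between the minimum endpoint value and the upper limit into two equal subintervals, taking the smaller subinterval as the current first coding interval and the larger one as the current second coding interval. The next character to be encoded becomes the current character, the next interval length becomes the current interval length, and the device returns to the acquiring operation. When all characters have been encoded, the device acquires the two endpoint values of the current second coding interval and determines whether they have the same digit at the current highest position. If so, it outputs that digit as a target digit, takes the next adjacent position as the current highest position, and returns to acquiring the two endpoint values; if not, it arranges the target digits in output order to obtain the encoded value. It then counts the total quantity of characters in the string and the second quantity of second preset characters, and uses the encoded value, the second quantity, and the total quantity as the encoding result, so that binary strings are encoded with strong compression capability. Afterwards, the electronic device can obtain a decoding request carrying the encoding result and generate a reference string from it; the reference string comprises a first quantity of first preset characters and a second quantity of second preset characters, the first quantity equals the total quantity minus the second quantity, and the highest-order character of the reference string is the second preset character. The device encodes the current reference string to obtain a reference code value, pads the end of the encoded value with the first preset character so that the encoded value and the reference code value have the same number of characters, and determines whether the padded encoded value is not less than the reference code value: if so, the second preset character is the decoded character; if not, the first preset character is. It then updates the arrangement of the characters in the current reference string according to the decoded character, takes the updated reference string as the current reference string, and returns to the operation of encoding the current reference string, until the cumulative number of encodings equals the total quantity. Finally, the decoding result is generated from all decoded characters, achieving lossless compression of the binary string with a simple and flexible method.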
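Based on the method described in the above embodiments, this embodiment provides further description from the perspective of a data encoding apparatus. The data encoding apparatus may be implemented as a standalone entity, or integrated into an electronic device such as a terminal or a server; the electronic device may include a smartphone, a tablet computer, a personal computer and the like.
Referring to FIG. 3a, which illustrates the data encoding apparatus provided by an embodiment of the present invention, the apparatus may include an acquisition module 10, a first determination module 20, a selection module 30, a second determination module 40, an encoding module 50 and a generation module 60, where:
(1) Acquisition module 10
The acquisition module 10 is configured to obtain a character string to be encoded and a preset interval length, where the character string to be encoded includes a plurality of first preset characters and second preset characters.
In this embodiment, the character string to be encoded includes a binary character string; the first preset character may be 0 and the second preset character may be 1. The preset interval length mainly defines the size of the initial encoding space. It may be set manually to 100000000000 or larger, depending on actual requirements.
(2) First determination module 20
The first determination module 20 is configured to determine a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded.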
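In this embodiment, the coefficient threshold corresponding to the maximum number may be obtained by table lookup. Its value generally depends only on the number of consecutive 1s in the character string to be encoded: the more consecutive 1s, the smaller the threshold. In practice, the relationship between the coefficient value and the number of consecutive 1s in a sample may be derived by computation over a large number of samples; the coefficient thresholds corresponding to different numbers of consecutive 1s are then stored in a table, and when needed the corresponding value is read directly from the table according to the number of consecutive 1s. The computation is mainly carried out by the formula
Figure PCTCN2018088746-appb-000005
where i, j, n ∈ [1, Len], Len is the total character length of each sample, p(n) > 1, T is the total count of all symbols in each sample, f is the count of a given symbol in each sample, O is the cumulative count of all symbols preceding a given symbol, and α is the coefficient value. The relationship between the coefficient value and the number of consecutive 1s in the sample is determined from the variation of p(n).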
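The lookup key for the threshold table is the longest run of the second preset character. A minimal sketch of computing that key follows; the function name `max_consecutive` is illustrative, and the threshold table itself is not reproduced here because its values come from the formula figure, which is not available as text.

```python
def max_consecutive(bits: str, symbol: str = "1") -> int:
    """Return the length of the longest run of `symbol` in `bits`."""
    longest = current = 0
    for ch in bits:
        # Extend the current run on a match, otherwise reset it.
        current = current + 1 if ch == symbol else 0
        longest = max(longest, current)
    return longest


run = max_consecutive("1010000110010101000100010")
print(run)
```

For the example sequence used throughout this embodiment, the longest run of 1s has length 2.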
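(3) Selection module 30
The selection module 30 is configured to select, from a preset coefficient list, an encoding coefficient smaller than the coefficient threshold.
In this embodiment, the preset coefficient list may be set in advance: the coefficient thresholds corresponding to different numbers of consecutive 1s are computed beforehand and stored in the preset coefficient list as coefficient values in ascending or descending order. When an encoding coefficient needs to be selected, the coefficient values smaller than the current coefficient threshold are first picked from the preset coefficient list. If a single coefficient value is picked, it may be used directly as the encoding coefficient; if several are picked, one of them may be chosen as the encoding coefficient at random or by another predefined rule.
(4) Second determination module 40
The second determination module 40 is configured to determine, according to the interval length and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character.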
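In this embodiment, the interval length may first be initialized by the formula $R'_0 = R_0 \alpha_0$, where $R'_0$ is the initialized interval length, $R_0$ is the preset interval length and $\alpha_0$ is the encoding coefficient. For example, if $\alpha_0$ is 1.1 and $R_0$ is 100000000000, the initialized interval length $R'_0$ is 110000000000, and the initialization interval may be [0, 110000000000]. The initialization interval is then divided to obtain the smaller first encoding interval and second encoding interval. For a binary character string the division is usually an equal split, i.e. the first encoding interval $U'_0$ = [0, 54999999999] and the second encoding interval $U'_1$ = [55000000000, 110000000000], where $U'_0$ is the interval corresponding to character 0, $U'_1$ is the interval corresponding to character 1, and the interval length of each of $U'_0$ and $U'_1$ is $R'_0/2$.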
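The initialization can be sketched with exact integer arithmetic. This is an illustration under stated assumptions: the coefficient is passed as a `Fraction` so that 1.1 is represented exactly, and the function name `init_intervals` and the tuple representation of an interval are choices made for the sketch, not part of the source.

```python
from fractions import Fraction


def init_intervals(preset_len: int, alpha0: Fraction):
    """Scale the preset length by alpha0 and split the result in two halves.

    Returns (R'_0, U'_0, U'_1) with each interval as a (low, high) tuple.
    """
    r0 = int(preset_len * alpha0)   # truncates if the product is not an integer
    half = r0 // 2
    u0 = (0, half - 1)              # interval for character 0
    u1 = (half, r0)                 # interval for character 1
    return r0, u0, u1


r0, u0, u1 = init_intervals(100_000_000_000, Fraction(11, 10))
print(r0, u0, u1)
```

With $R_0$ = 100000000000 and $\alpha_0$ = 1.1 this yields the initialized length and the two intervals given in the example above.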
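(5) Encoding module 50
The encoding module 50 is configured to encode the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval, to obtain an encoded value.
For example, referring to FIG. 3b, the encoding module 50 may specifically include:
a first acquisition submodule 51, configured to obtain the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length.
In this embodiment, the encoding coefficient $\alpha_n$ may vary continuously or may be a fixed value. When fixed, it may be $\alpha_0$; when not fixed, it may be given by $K(\alpha_n) = \alpha_n \alpha_{n-1}$, where $K(\alpha_n)$ may be a specified qualitative function, such as addition, multiplication or logic (important or not important). The character to be encoded $S_n$, the first encoding interval $U'_0(n)$, the second encoding interval $U'_1(n)$ and the interval length $R'_n$ change continuously and are updated each time a character is encoded. For the first encoding pass, the character to be encoded is usually the first character of the character string to be encoded; in that case the encoding coefficient is $\alpha_0$, the first encoding interval is $U'_0(0)$, the second encoding interval is $U'_1(0)$, and the interval length (i.e. the total length of $U'_0(0)$ and $U'_1(0)$) is $R'_0$.
a determination submodule 52, configured to determine a target interval from the current first encoding interval and the current second encoding interval according to the current character to be encoded.
For example, the determination submodule 52 may specifically be configured to:
determine whether the current character to be encoded is the first preset character;
if so, determine the current first encoding interval as the target interval;
if not, determine the current second encoding interval as the target interval.
In this embodiment, the encoding interval corresponding to the current character to be encoded $S_n$ must be found during encoding. For example, if $S_n$ is 0, the target interval determined by the determination submodule 52 is $U'_0$; if $S_n$ is 1, the target interval determined by the determination submodule 52 is $U'_1$.
an update submodule 53, configured to update the current first encoding interval and the current second encoding interval according to the current interval length, the current encoding coefficient and the target interval, so as to encode the current character to be encoded;
For example, the update submodule 53 may specifically be configured to:
1-1. Calculate the next interval length according to the current interval length and the current encoding coefficient.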
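In this embodiment,
Figure PCTCN2018088746-appb-000006
where Len is the total length of the character string to be encoded and $L_S$ is the number of symbol types in the character string to be encoded. For example, for a binary character string, since its symbols include only 0 and 1, $L_S$ is 2, in which case
Figure PCTCN2018088746-appb-000007
Figure PCTCN2018088746-appb-000008
and so on.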
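1-2. Obtain the previously encoded characters, and calculate the ratio of first preset characters to second preset characters among the previously encoded characters and the current character to be encoded.
In this embodiment, a dynamic ratio $f_k$ may be calculated with an adaptive probability statistical model, i.e. the ratio of first preset characters to second preset characters among the previously encoded characters and the current character to be encoded is recalculated at every step, so that $f_k \neq f_{k-1}$. For example, for the symbol sequence 1010000110010101000100010, if the current character being encoded is the third character, the ratio is 1/2.
Alternatively, a static ratio $f_k$ may be calculated with a static statistical model, i.e. the ratio is directly defined as a static value (for example $f_k = 1$), so that $f_k = f_{k-1}$.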
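The adaptive ratio can be sketched as a running count over the characters seen so far. The function name `adaptive_ratio` and the choice to return `None` when no 1 has been seen yet are assumptions of this sketch; the source does not say how a zero denominator is handled.

```python
from fractions import Fraction


def adaptive_ratio(bits: str, k: int):
    """Ratio of 0s to 1s among the first k characters of `bits`.

    The first k characters are the encoded history plus the current character.
    Returns None when no '1' has been seen yet (assumption: undefined ratio).
    """
    seen = bits[:k]
    zeros = seen.count("0")
    ones = seen.count("1")
    if ones == 0:
        return None
    return Fraction(zeros, ones)


sequence = "1010000110010101000100010"
print(adaptive_ratio(sequence, 3))
```

At the third character of the example sequence the history is 101, which contains one 0 and two 1s, so the ratio is 1/2 as stated in the text.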
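1-3. Update the current first encoding interval and the current second encoding interval according to the ratio, the next interval length and the minimum endpoint value of the target interval.
For example, step 1-3 above may specifically include:
calculating an interval upper limit according to the next interval length and the minimum endpoint value;
dividing the interval between the minimum endpoint value and the interval upper limit using the ratio, to obtain two sub-intervals;
updating the current first encoding interval with the lower-valued sub-interval after division, and updating the current second encoding interval with the higher-valued sub-interval after division.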
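In this embodiment, the interval upper limit is $H'_n = L'_{n-1} + R'_n$, where $L'_{n-1}$ is the minimum endpoint of the target interval. In the adaptive probability statistical model, $U'_0(n) = [L'_{n-1},\ L'_{n-1} + (f_k/(f_k+1)) R'_n]$ and $U'_1(n) = [L'_{n-1} + (f_k/(f_k+1)) R'_n,\ H'_n]$. In the static probability statistical model, $U'_0(n) = [L'_{n-1},\ L'_{n-1} + R'_n/2 - 1]$ and $U'_1(n) = [L'_{n-1} + R'_n/2,\ H'_n]$.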
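The two splitting rules can be sketched as follows. This is an illustration rather than the patented implementation: both bounds are computed from the next interval length $R'_n$, fractional split points are truncated to integers, and the names `split_static` and `split_adaptive` are hypothetical.

```python
from fractions import Fraction


def split_static(low: int, next_len: int):
    """Equal split: U'_0 = [L, L + R/2 - 1], U'_1 = [L + R/2, H] with H = L + R."""
    high = low + next_len
    mid = low + next_len // 2
    return (low, mid - 1), (mid, high)


def split_adaptive(low: int, next_len: int, ratio: Fraction):
    """Ratio split: the boundary sits at L + f/(f+1) * R, truncated to an integer."""
    high = low + next_len
    boundary = low + int(ratio / (ratio + 1) * next_len)
    return (low, boundary), (boundary, high)


print(split_static(0, 110_000_000_000))
print(split_adaptive(0, 90, Fraction(1, 2)))
```

With a ratio of 1 the adaptive boundary falls at the midpoint, so the static model can be seen as the special case $f_k = 1$ apart from the one-unit gap that the static formulas leave between the two sub-intervals.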
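a return module 54, configured to, when the encoding is complete, take the updated first encoding interval and the updated second encoding interval as the current first encoding interval and the current second encoding interval, take the next character to be encoded as the current character to be encoded, and return to the operation of obtaining the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length, until all characters to be encoded have been encoded.
In this embodiment, when n = Len the entire character string to be encoded has been encoded, at which point $U'_1(n)$ and $U'_0(n)$ are obtained.
Of course, the encoding procedure is the same for every character to be encoded; that is, encoding the character string is a continuously looping process. For the loop to proceed correctly, the encoding coefficient and the interval length involved in each iteration should be updated continually. Accordingly, the return module 54 may further be configured to:
when the encoding is complete, count the number of previously encoded characters;
calculate the next encoding coefficient according to the number of previously encoded characters and the current encoding coefficient;
take the next encoding coefficient as the current encoding coefficient, and take the next interval length as the current interval length.
In this embodiment, the encoding coefficient $\alpha_n$ may be a static value, for example $\alpha_n = \alpha_0$, or a dynamic value. When dynamic, $K(\alpha_n) = \alpha_n \alpha_{n-1}$, where n−1 is the number of previously encoded characters. When the current encoding operation is complete, the encoding coefficient for the next encoding operation is $\alpha_n$ and the interval length is $R'_n$.
In addition, since what is obtained once all characters to be encoded have been encoded is $U'_1(n)$ and $U'_0(n)$, and the encoded value of the character string to be encoded may be any value within $U'_1(n)$ and $U'_0(n)$, referring to FIG. 3c, the encoding module 50 may further include:
a second acquisition submodule 55, configured to obtain the two endpoint values of the current second encoding interval when all characters to be encoded have been encoded;
a judgment submodule 56, configured to determine whether the digits at the current highest position of the two endpoint values are the same;
an output submodule 57, configured to, if the judgment result is yes, output the identical digit as a target digit, take the position adjacent to the highest position as the current highest position, and then return to the step of obtaining the two endpoint values of the current second encoding interval, until the judgment result is no;
a sorting submodule 58, configured to arrange the target digits in output order to obtain the encoded value.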
In this embodiment, if $U'_1(n)$ = [73042919870, 73042952160], the encoded value V is 730429.
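Extracting the encoded value amounts to taking the common leading decimal digits of the two endpoints. A minimal sketch follows; the function name `common_leading_digits` is illustrative, and returning an empty string when the endpoints have different digit counts is an assumption, since the source does not cover that case.

```python
def common_leading_digits(low: int, high: int) -> str:
    """Return the decimal digits shared by `low` and `high` from the highest position down."""
    a, b = str(low), str(high)
    if len(a) != len(b):
        return ""  # assumption: endpoints of different width share no leading digit
    digits = []
    for x, y in zip(a, b):
        if x != y:
            break  # first differing position ends the encoded value
        digits.append(x)
    return "".join(digits)


v = common_leading_digits(73042919870, 73042952160)
print(v)
```

For the endpoints in the example, the first six digits agree and the seventh differs (1 versus 5), giving V = 730429.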
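(6) Generation module 60
The generation module 60 is configured to generate an encoding result according to the encoded value and to output the encoding result.
For example, the generation module 60 may specifically be configured to:
count the total number of characters in the character string to be encoded and the second number of second preset characters;
take the encoded value, the second number and the total number as the encoding result of the character string to be encoded.
In this embodiment, if the character string to be encoded is 1010000110010101000100010, the total number Len is 25, the second number Count is 9, and the encoding result is V, Count, Len.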
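Assembling the encoding result is a matter of counting. The sketch below assumes the result is carried as a plain tuple (V, Count, Len); the source does not specify a container or serialization, and `build_result` is an illustrative name.

```python
def build_result(encoded_value: str, source_bits: str):
    """Return (V, Count, Len): encoded value, number of 1s, total length."""
    count_ones = source_bits.count("1")
    total_len = len(source_bits)
    return encoded_value, count_ones, total_len


result = build_result("730429", "1010000110010101000100010")
print(result)
```

For the example string this gives Count = 9 and Len = 25, matching the values stated above.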
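In addition, outputting the encoding result is followed by a decoding operation. That is, the data encoding apparatus may further include a decoding module configured to:
obtain a decoding request after the encoding result is output, the decoding request carrying the encoding result;
generate a reference character string according to the decoding request, where the reference character string includes a first number of first preset characters and a second number of second preset characters, the first number equals the difference between the total number and the second number, and the highest-order character of the reference character string is the second preset character;
decode the encoded value according to the reference character string.
2-1. Obtain a decoding request, the decoding request carrying the encoding result;
2-2. Generate a reference character string according to the decoding request, where the reference character string includes a first number of first preset characters and a second number of second preset characters, the first number equals the difference between the total number and the second number, and the highest-order character of the reference character string is the second preset character.
In this embodiment, for Count = 9 and Len = 25, the first number is 16 and the reference character string is likewise 25 characters long; its first character is 1 and its trailing characters are all 1, i.e. the initial sequence of the reference character string is 1000000000000000011111111.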
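The initial reference string follows directly from Count and Len: one leading 1, then all the 0s, then the remaining 1s. A minimal sketch, with the hypothetical name `initial_reference` and an input check that is not described in the source:

```python
def initial_reference(count_ones: int, total_len: int) -> str:
    """Build the starting reference string: '1', then all 0s, then the other 1s."""
    if not 1 <= count_ones <= total_len:
        raise ValueError("count_ones must be between 1 and total_len")
    zeros = total_len - count_ones          # the "first number" in the text
    return "1" + "0" * zeros + "1" * (count_ones - 1)


ref = initial_reference(9, 25)
print(ref)
```

For Count = 9 and Len = 25 this produces a leading 1, sixteen 0s and eight trailing 1s, which is the layout of the initial sequence given above.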
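2-3. Decode the encoded value according to the reference character string.
For example, the decoding module may further be configured to:
encode the current reference character string to obtain a reference encoded value;
determine a decoded character according to the reference encoded value and the encoded value, and update the arrangement of characters in the current reference character string according to the decoded character;
take the updated reference character string as the current reference character string, and return to the step of encoding the current reference character string, until the cumulative number of encodings equals the total number;
generate a decoding result from all decoded characters.
In this embodiment, every character of each reference character string may be encoded by the encoding method described above, with the preset interval length again $R_0$ and the initial encoding coefficient again $\alpha_0$. When all characters in the reference character string have been encoded, the first encoding interval $D'_0$ corresponding to character 0 and the second encoding interval $D'_1$ corresponding to character 1 are finally obtained. The difference is that the encoded value of the reference character string (i.e. the reference encoded value) is not the run of identical high-order digits: the minimum endpoint value of $D'_0$ may first be taken directly as $T_n$, and the reference encoded value $t_n$ is then calculated by the specified function $t_n = T_n y(n)$, where the function $y(n)$ is obtained experimentally and is related to $\alpha_n$. The decoded character, which is either 1 or 0, is then determined from the reference encoded value $t_n$ and the encoded value V. When the number of encodings reaches Len, all decoded characters may be arranged in the order in which they were determined, and the resulting character sequence is the decoding result.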
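It should be noted that the current reference character string changes continuously. Each change depends on the previous reference character string and its reference encoded value $t_n$; that is, the positions of the characters 1 and 0 in the current reference character string are rearranged according to the previous reference character string and its reference encoded value $t_n$. The adjustment mainly involves moving a character 1. For example, if the previous reference character string is 1100000000000000001111111 and the decoded character is 1, the highest-order character 1 of the trailing group is moved to just after the lowest-order character 1 of the leading group, giving 1110000000000000000111111 (the updated reference character string). If the decoded character is 0, the lowest-order character 1 of the leading group is moved back by one position, giving 1010000000000000001111111 (the updated reference character string).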
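The two adjustment rules can be sketched as a single update function. This is one reading of the description, not a definitive implementation: `pos` is assumed to be the index of the trial 1 that was just tested, the function name `update_reference` is hypothetical, and the decoded-0 branch assumes the position after the trial 1 currently holds a 0, which is the case in every example given in the text.

```python
def update_reference(ref: str, pos: int, decoded: str) -> str:
    """Rearrange the reference string after the trial '1' at index `pos` was tested."""
    chars = list(ref)
    nxt = pos + 1
    if nxt >= len(chars):
        return ref                      # no position left to test
    if decoded == "1":
        # Keep the trial 1 and pull the leftmost 1 of the trailing group forward.
        tail = ref.find("1", nxt)
        if tail == -1:
            return ref                  # no spare 1s remain
        chars[tail] = "0"
        chars[nxt] = "1"
    else:
        # The trial 1 was wrong here: shift it one position to the right.
        chars[pos] = "0"
        chars[nxt] = "1"
    return "".join(chars)


previous = "11" + "0" * 16 + "1" * 7    # the previous reference string from the example
print(update_reference(previous, 1, "1"))
print(update_reference(previous, 1, "0"))
```

Applied to the example, a decoded 1 yields three leading 1s followed by sixteen 0s and six 1s, and a decoded 0 yields 101 followed by fifteen 0s and seven 1s, in line with the two updated strings quoted above.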
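Further, the decoding module may be configured to:
append first preset characters to the tail of the encoded value so that the encoded value and the reference encoded value have the same number of characters;
determine whether the padded encoded value is not less than the reference encoded value;
if so, determine the second preset character as the decoded character;
if not, determine the first preset character as the decoded character.
In this embodiment, when V = 730429, for the n-th encoding operation, if the reference encoded value $t_n$ = 85252570554, V padded with 0s becomes V′ = 73042900000; since $t_n$ > V′, the decoded character is 0. If the reference encoded value $t_n$ = 55004691494, then $t_n$ ≤ V′ and the decoded character is 1.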
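The padding and comparison step can be sketched as follows, assuming the encoded value is held as a decimal string and the reference encoded value as an integer; the names `pad_encoded_value` and `decode_char` are illustrative.

```python
def pad_encoded_value(v: str, width: int) -> int:
    """Append 0s to the tail of V until it has `width` digits, then parse it."""
    return int(v.ljust(width, "0"))


def decode_char(v: str, t_n: int) -> str:
    """Return '1' if the padded V is not less than t_n, otherwise '0'."""
    v_padded = pad_encoded_value(v, len(str(t_n)))
    return "1" if v_padded >= t_n else "0"


print(pad_encoded_value("730429", 11))
print(decode_char("730429", 85252570554), decode_char("730429", 55004691494))
```

Both reference values in the example have 11 digits, so V = 730429 is padded to 73042900000 before the comparison.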
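In specific implementations, each of the above units may be implemented as an independent entity, or the units may be combined in any way and implemented as one or several entities. For the specific implementation of each unit, refer to the preceding method embodiments; details are not repeated here.
As can be seen from the above, in the data encoding apparatus provided by this embodiment, the acquisition module 10 obtains a character string to be encoded and a preset interval length, where the character string to be encoded includes a plurality of first preset characters and second preset characters; the first determination module 20 determines a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded; the selection module 30 then selects from a preset coefficient list an encoding coefficient smaller than the coefficient threshold; the second determination module 40 determines, according to the interval length and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character; the encoding module 50 then encodes the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval to obtain an encoded value; and the generation module 60 generates an encoding result according to the encoded value and outputs the encoding result. Lossless compression of binary data can thus be achieved effectively, with strong compression capability and good compression results.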
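Correspondingly, an embodiment of the present invention further provides an electronic device. FIG. 4 shows a schematic structural diagram of the electronic device involved in the embodiment of the present invention. Specifically:
The electronic device may include components such as a processor 701 with one or more processing cores, a memory 702 with one or more computer-readable storage media, a power supply 703 and an input unit 704. Those skilled in the art will understand that the structure shown in FIG. 4 does not limit the electronic device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:
The processor 701 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 702 and invoking data stored in the memory 702, thereby monitoring the electronic device as a whole. Optionally, the processor 701 may include one or more processing cores. Preferably, the processor 701 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It will be understood that the modem processor may also not be integrated into the processor 701.
The memory 702 may be used to store software programs and modules, and the processor 701 performs various functional applications and data processing by running the software programs and modules stored in the memory 702. The memory 702 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, application programs required by at least one function (such as a sound playback function or an image playback function) and the like, and the data storage area may store data created through use of the electronic device. In addition, the memory 702 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 702 may also include a memory controller to provide the processor 701 with access to the memory 702.
The electronic device further includes a power supply 703 that powers the various components. Preferably, the power supply 703 may be logically connected to the processor 701 through a power management system, so that charging, discharging and power consumption management are handled through the power management system. The power supply 703 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The electronic device may further include an input unit 704, which may be used to receive entered numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 701 in the electronic device loads the executable files corresponding to the processes of one or more application programs into the memory 702 according to the following instructions, and runs the application programs stored in the memory 702 to implement various functions, as follows: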
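obtaining a character string to be encoded and a preset interval length, where the character string to be encoded includes a plurality of first preset characters and second preset characters;
determining a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded;
selecting, from a preset coefficient list, an encoding coefficient smaller than the coefficient threshold;
determining, according to the interval length and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character;
encoding the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval, to obtain an encoded value;
generating an encoding result according to the encoded value, and outputting the encoding result.
The electronic device can achieve the beneficial effects achievable by any data encoding apparatus provided by the embodiments of the present invention; see the preceding embodiments for details, which are not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps in the various methods of the above embodiments may be completed by instructions, or by instructions controlling the relevant hardware. The instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps of any data encoding method provided by the embodiments of the present invention.
The storage medium may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc or the like.
Since the instructions stored in the storage medium can perform the steps of any data encoding method provided by the embodiments of the present invention, they can achieve the beneficial effects achievable by any such data encoding method; see the preceding embodiments for details, which are not repeated here.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
The data encoding method, apparatus, storage medium and electronic device provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, those skilled in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.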

Claims (14)

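  1. A data encoding method, comprising:
    obtaining a character string to be encoded and a preset interval length, the character string to be encoded including a plurality of first preset characters and second preset characters;
    determining a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded;
    selecting, from a preset coefficient list, an encoding coefficient smaller than the coefficient threshold;
    determining, according to the interval length and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character;
    encoding the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval, to obtain an encoded value;
    generating an encoding result according to the encoded value, and outputting the encoding result.
  2. The data encoding method according to claim 1, wherein the encoding of the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval comprises:
    obtaining the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length;
    determining a target interval from the current first encoding interval and the current second encoding interval according to the current character to be encoded;
    updating the current first encoding interval and the current second encoding interval according to the current interval length, the current encoding coefficient and the target interval, so as to encode the current character to be encoded;
    when the encoding is complete, taking the updated first encoding interval and the updated second encoding interval as the current first encoding interval and the current second encoding interval, taking the next character to be encoded as the current character to be encoded, and returning to the operation of obtaining the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length, until all characters to be encoded have been encoded.
  3. The data encoding method according to claim 2, wherein the determining of a target interval from the current first encoding interval and the current second encoding interval according to the current character to be encoded comprises:
    determining whether the current character to be encoded is the first preset character;
    if so, determining the current first encoding interval as the target interval;
    if not, determining the current second encoding interval as the target interval.
  4. The data encoding method according to claim 2, wherein the updating of the current first encoding interval and the current second encoding interval according to the current interval length, the current encoding coefficient and the target interval comprises:
    calculating the next interval length according to the current interval length and the current encoding coefficient;
    obtaining the previously encoded characters, and calculating the ratio of the first preset character to the second preset character among the previously encoded characters and the current character to be encoded;
    updating the current first encoding interval and the current second encoding interval according to the ratio, the next interval length and the minimum endpoint value of the target interval.
  5. The data encoding method according to claim 4, wherein the updating of the current first encoding interval and the current second encoding interval according to the ratio, the next interval length and the minimum endpoint value of the target interval comprises:
    calculating an interval upper limit according to the next interval length and the minimum endpoint value;
    dividing the interval between the minimum endpoint value and the interval upper limit using the ratio, to obtain two sub-intervals;
    updating the current first encoding interval with the lower-valued sub-interval after division, and updating the current second encoding interval with the higher-valued sub-interval after division.
  6. The data encoding method according to claim 4, further comprising, when the encoding is complete:
    counting the number of previously encoded characters;
    calculating the next encoding coefficient according to the number of previously encoded characters and the current encoding coefficient;
    taking the next encoding coefficient as the current encoding coefficient, and taking the next interval length as the current interval length.
  7. The data encoding method according to claim 2, wherein the encoding of the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval to obtain an encoded value further comprises:
    when all characters to be encoded have been encoded, obtaining the two endpoint values of the current second encoding interval;
    determining whether the digits at the current highest position of the two endpoint values are the same;
    if the judgment result is yes, outputting the identical digit as a target digit, taking the position adjacent to the highest position as the current highest position, and then returning to the step of obtaining the two endpoint values of the current second encoding interval, until the judgment result is no;
    arranging the target digits in output order to obtain the encoded value.
  8. The data encoding method according to claim 1, wherein the generating of an encoding result according to the encoded value comprises:
    counting the total number of characters in the character string to be encoded and the second number of second preset characters;
    taking the encoded value, the second number and the total number as the encoding result of the character string to be encoded.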
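  9. The data encoding method according to claim 1, further comprising, after the encoding result is output:
    obtaining a decoding request, the decoding request carrying the encoding result;
    generating a reference character string according to the decoding request, the reference character string including a first number of first preset characters and a second number of second preset characters, the first number being equal to the difference between the total number and the second number, and the highest-order character of the reference character string being the second preset character;
    decoding the encoded value according to the reference character string.
  10. The data encoding method according to claim 9, wherein the decoding of the encoded value according to the reference character string comprises:
    encoding the current reference character string to obtain a reference encoded value;
    determining a decoded character according to the reference encoded value and the encoded value, and updating the arrangement of characters in the current reference character string according to the decoded character;
    taking the updated reference character string as the current reference character string, and returning to the step of encoding the current reference character string, until the cumulative number of encodings equals the total number;
    generating a decoding result from all decoded characters.
  11. The data encoding method according to claim 10, wherein the determining of a decoded character according to the reference encoded value and the encoded value comprises:
    appending first preset characters to the tail of the encoded value so that the encoded value and the reference encoded value have the same number of characters;
    determining whether the padded encoded value is not less than the reference encoded value;
    if so, determining the second preset character as the decoded character;
    if not, determining the first preset character as the decoded character.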
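  12. A data encoding apparatus, comprising:
    an acquisition module, configured to obtain a character string to be encoded and a preset interval length, the character string to be encoded including a plurality of first preset characters and second preset characters;
    a first determination module, configured to determine a coefficient threshold according to the maximum number of consecutive occurrences of the second preset character in the character string to be encoded;
    a selection module, configured to select, from a preset coefficient list, an encoding coefficient smaller than the coefficient threshold;
    a second determination module, configured to determine, according to the interval length and the encoding coefficient, a first encoding interval corresponding to the first preset character and a second encoding interval corresponding to the second preset character;
    an encoding module, configured to encode the character string to be encoded using the interval length, the encoding coefficient, the first encoding interval and the second encoding interval, to obtain an encoded value;
    a generation module, configured to generate an encoding result according to the encoded value and to output the encoding result.
  13. The data encoding apparatus according to claim 12, characterized in that the encoding module specifically comprises:
    a first acquisition submodule, configured to obtain the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length;
    a determination submodule, configured to determine a target interval from the current first encoding interval and the current second encoding interval according to the current character to be encoded;
    an update submodule, configured to update the current first encoding interval and the current second encoding interval according to the current interval length, the current encoding coefficient and the target interval, so as to encode the current character to be encoded;
    a return module, configured to, when the encoding is complete, take the updated first encoding interval and the updated second encoding interval as the current first encoding interval and the current second encoding interval, take the next character to be encoded as the current character to be encoded, and return to the operation of obtaining the current character to be encoded, the current encoding coefficient, the current first encoding interval, the current second encoding interval and the current interval length, until all characters to be encoded have been encoded.
  14. A storage medium, characterized in that the storage medium stores a plurality of instructions, the instructions being adapted to be loaded by a processor to perform the steps of the data encoding method according to any one of claims 1 to 11.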
PCT/CN2018/088746 2017-08-30 2018-05-28 Data encoding method, apparatus and storage medium WO2019041918A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020187017276A KR20190038746A (ko) 2017-08-30 2018-05-28 데이터 인코딩 방법, 장치 및 저장매체

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710765880.5 2017-08-30
CN201710765880.5A CN109428602A (zh) 2017-08-30 2017-08-30 一种数据编码方法、装置以及存储介质

Publications (1)

Publication Number Publication Date
WO2019041918A1 true WO2019041918A1 (zh) 2019-03-07

Family

ID=65504156

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/088746 WO2019041918A1 (zh) 2017-08-30 2018-05-28 一种数据编码方法、装置以及存储介质

Country Status (3)

Country Link
KR (1) KR20190038746A (zh)
CN (1) CN109428602A (zh)
WO (1) WO2019041918A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113098524A (zh) * 2021-03-22 2021-07-09 北京达佳互联信息技术有限公司 信息编码方法、装置、电子设备及存储介质
CN113746599A (zh) * 2021-08-24 2021-12-03 湖南遥昇通信技术有限公司 编码方法、译码方法、终端、电子设备和存储介质
CN113766237A (zh) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 一种编码方法、解码方法、装置、设备及可读存储介质
CN117353751A (zh) * 2023-12-06 2024-01-05 山东万辉新能源科技有限公司 基于大数据的无人充电桩交易数据智能管理系统

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112104441B (zh) * 2020-11-05 2021-02-12 广州市玄武无线科技股份有限公司 业务数据包编码及解码方法、装置及系统
CN112330948B (zh) * 2021-01-04 2021-04-27 杭州涂鸦信息技术有限公司 红外遥控码匹配方法、装置、计算机设备和可读存储介质
CN116610265B (zh) * 2023-07-14 2023-09-29 济南玖通志恒信息技术有限公司 一种商务信息咨询系统的数据存储方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101257366A (zh) * 2008-03-27 2008-09-03 华为技术有限公司 编解码方法、通讯系统及设备
CN101409759A (zh) * 2008-11-19 2009-04-15 上海大学 基于连“1”特性的jpeg图像再次编码检测方法
US20100007533A1 (en) * 2008-07-08 2010-01-14 Qualcomm Incorporated Cavlc run-before decoding scheme
CN106445890A (zh) * 2016-07-07 2017-02-22 湖南千年华光软件开发有限公司 数据处理方法

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113098524A (zh) * 2021-03-22 2021-07-09 北京达佳互联信息技术有限公司 信息编码方法、装置、电子设备及存储介质
CN113746599A (zh) * 2021-08-24 2021-12-03 湖南遥昇通信技术有限公司 编码方法、译码方法、终端、电子设备和存储介质
CN113746599B (zh) * 2021-08-24 2024-03-22 湖南遥昇通信技术有限公司 编码方法、译码方法、终端、电子设备和存储介质
CN113766237A (zh) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 一种编码方法、解码方法、装置、设备及可读存储介质
CN117353751A (zh) * 2023-12-06 2024-01-05 山东万辉新能源科技有限公司 基于大数据的无人充电桩交易数据智能管理系统
CN117353751B (zh) * 2023-12-06 2024-02-23 山东万辉新能源科技有限公司 基于大数据的无人充电桩交易数据智能管理系统

Also Published As

Publication number Publication date
CN109428602A (zh) 2019-03-05
KR20190038746A (ko) 2019-04-09

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 20187017276

Country of ref document: KR

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18850553

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18850553

Country of ref document: EP

Kind code of ref document: A1