CN110602498A - Self-adaptive finite state entropy coding method - Google Patents

Self-adaptive finite state entropy coding method

Info

Publication number
CN110602498A
CN110602498A
Authority
CN
China
Prior art keywords
data
state
coded
decoding
output
Prior art date
Legal status: Granted
Application number
CN201910890254.8A
Other languages
Chinese (zh)
Other versions
CN110602498B (en)
Inventor
唐驰鹏
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to CN201910890254.8A
Publication of CN110602498A
Application granted granted Critical
Publication of CN110602498B
Status: Active

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/90 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Abstract

The invention discloses a self-adaptive finite state entropy coding method, which relates to the field of data compression and comprises the following steps: scanning the data to be encoded to obtain a frequency set of symbols, preprocessing the frequency set, dynamically maintaining and updating the frequency set and a cumulative distribution set, and performing adaptive encoding based on the coding rule combined with reforming processing to obtain coded output data; establishing an initial frequency set with all elements equal to 1, reading in the data to be decoded, performing self-adaptive decoding based on the decoding rule combined with inverse reforming processing, and dynamically maintaining and updating the frequency set and the cumulative distribution set to obtain decoded output data; and transforming the alphabet set of the data to be coded and the alphabet set of the coded output data, and carrying out self-adaptive finite state entropy coding on the data to be coded to obtain encrypted data. The invention can simplify the coding steps and improve the coding speed on the premise of ensuring the coding precision, and can better meet the coding requirements at the present stage.

Description

Self-adaptive finite state entropy coding method
Technical Field
The invention relates to the technical field of data coding, in particular to a self-adaptive finite state entropy coding method.
Background
Entropy coding is a lossless data compression method based on information entropy theory. Common entropy codes include Shannon coding, Huffman coding and arithmetic coding, which are widely used to compress various kinds of data such as images, video, speech and text.
In the internet field, data compression technology not only reduces storage requirements but also reduces the bandwidth occupied by data transmission, which greatly saves data storage and transmission costs, and lossless compression has long been a research focus for scholars at home and abroad. With continuous technical improvements, Huffman coding and arithmetic coding, as the better-performing entropy codes, have been widely used in various fields.
Huffman coding, also called optimal coding, is a variable length coding scheme that constructs codewords with the shortest average length from the occurrence frequencies of the information characters. However, it cannot always approach the information entropy well, because the Huffman code length of a single character cannot be smaller than 1 bit, and since a Huffman tree must be constructed, adaptive Huffman coding requires dynamic adjustment of the tree, which is a complex and inefficient process. A practical improved scheme is canonical Huffman coding, which does not need to build a tree during encoding and decoding and greatly improves the coding speed, but it is difficult to make adaptive.
The principle of arithmetic coding is to construct a corresponding interval from the probabilities of the statistical information characters, to repeatedly subdivide the interval according to the input characters, and finally to output a decimal in the range [0, 1). The theory is mathematically elegant, but it cannot be realized directly on a computer because it would require representing a decimal of infinite precision. After continual improvement, practical schemes such as CACM87 and the Q-coder appeared, but every such improvement brings precision loss and added complexity to arithmetic coding; in practice, arithmetic coding only obtains a compression ratio slightly better than Huffman coding while being far slower than Huffman coding in encoding and decoding.
The principle of the asymmetric numeral system is to construct a corresponding coding table from the probability distribution of the information characters and to complete encoding and decoding through state transitions in that table. Asymmetric numeral systems include three variants: the uniform asymmetric binary system, the range asymmetric numeral system and the tabled asymmetric numeral system; none of them is an adaptive coding method. The uniform asymmetric binary system has the highest coding precision but can only code binary symbols, and the precisions of the range variant and the table variant are successively lower.
The invention provides a self-adaptive finite state entropy coding method based on the asymmetric numeral system, which can approach the information entropy more closely; in addition, on the basis of this scheme, data encryption is completed by transforming the coding table and the output coding symbol set, so the method can be widely applied to various data compression and encryption scenarios.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a self-adaptive finite state entropy coding method, which avoids pre-storing frequency or probability information in coded data and also avoids the problem that a static coding method cannot adapt to information with large statistical rule change, is consistent with arithmetic coding in coding precision and has higher coding speed than the arithmetic coding.
In order to achieve the above purposes, the technical scheme adopted by the invention is as follows:
The invention discloses a self-adaptive finite state entropy coding method, which comprises the following flows:
Encoding flow: scanning the data to be coded to obtain a frequency set of symbols, preprocessing the frequency set, dynamically maintaining and updating the frequency set and the cumulative distribution set according to the current symbol to be coded, and performing adaptive coding based on the coding rule combined with reforming (renormalization) processing to obtain coded output data;
Decoding flow: establishing an initial frequency set with all elements equal to 1, reading in the data to be decoded, performing self-adaptive decoding based on the decoding rule combined with inverse reforming processing, and dynamically maintaining and updating the frequency set and the cumulative distribution set according to the symbol output by the current decoding to obtain decoded output data;
Encryption flow: transforming the alphabet set of the data to be coded and the alphabet set of the coded output data, and carrying out self-adaptive finite state entropy coding on the data to be coded according to the two transformed sets to obtain the encrypted data.
On the basis of the above technical solution, the preprocessing specifically includes performing a self-increment 1 update operation on all elements in the original frequency set.
On the basis of the above technical solution, the encoding process includes the following steps:
scanning the data to be encoded to obtain a frequency set and an alphabet set of symbols, and initializing the alphabet set of encoding output data;
performing self-increment 1 preprocessing on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing a value of a state;
accessing the data to be coded in reverse order, updating the frequency set and the corresponding cumulative distribution set according to the current symbol to be coded, updating the state based on the coding rule combined with reforming processing, and outputting coded symbols;
repeatedly reforming the state until the state returns to zero;
wherein the alphabet set is denoted as Σ = {s_1, s_2, …, s_n};
the encoding output alphabet set is Γ = {t_0, t_1, …, t_{γ-1}};
the state is a variable x;
the frequency set records the number of occurrences F_i of each symbol s_i and is written in simplified form as F = {F_1, F_2, …, F_n};
the cumulative distribution set is denoted as A, where element A_i = F_1 + F_2 + … + F_i and A_0 = 0.
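As an illustration of the definitions above, a minimal Python sketch that builds Σ, the preprocessed frequency set F and the cumulative distribution set A from the data to be encoded; the function name and the sample string are illustrative and not taken from the patent.

```python
from collections import Counter

def build_model(data):
    """Build the alphabet Σ, the preprocessed frequency set F and the cumulative set A."""
    counts = Counter(data)
    sigma = sorted(counts)                 # alphabet set Σ = {s_1, ..., s_n}
    F = [counts[s] + 1 for s in sigma]     # self-increment-1 preprocessing of every F_i
    A = [0]                                # A_0 = 0
    for f in F:
        A.append(A[-1] + f)                # A_i = F_1 + F_2 + ... + F_i
    return sigma, F, A

sigma, F, A = build_model("ABRACADABRA")
# sigma = ['A', 'B', 'C', 'D', 'R'], F = [6, 3, 2, 2, 3], A = [0, 6, 9, 11, 13, 16]
```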
on the basis of the above technical solution, the encoding process includes the following steps:
M1, when the data to be coded is empty, entering step M3; otherwise reading a character s_i, performing a self-decreasing-1 update on element F_i in the frequency set F, updating the cumulative distribution set A accordingly, and then entering step M2;
M2, substituting the current frequency set F and cumulative distribution set A, the current state x and the symbol s_i to be coded into C(x, s_i) to calculate a new state x', and then substituting x' into the reforming function D_γ: when x' satisfies the reforming condition, outputting a coded symbol t_{x' mod γ}, changing the current state x to ⌊x'/γ⌋, i.e. x = ⌊x'/γ⌋, and repeating step M2; when x' does not satisfy the reforming condition, changing the current state x to x' and re-entering step M1;
M3, when the data to be coded is empty and the state x is still greater than 0, repeatedly substituting x into the reforming function D_γ: when ⌊x/γ⌋ is greater than 0, outputting a coded symbol t_{x mod γ}, changing the current state x to ⌊x/γ⌋, i.e. x = ⌊x/γ⌋, and repeating step M3; when ⌊x/γ⌋ equals 0, outputting a coded symbol t_{x mod γ}; the final state x is then 0 and the encoding is finished.
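A runnable Python sketch of steps M1 to M3. The steps above leave the reforming condition and the exact form of C(x, s_i) to the functions D_γ and C; the sketch assumes the standard asymmetric-numeral-system transition C(x, s_i) = A_n·⌊x/F_i⌋ + A_{i-1} + (x mod F_i) and renormalizes the state into [F_i, γ·F_i) before each coding step, an assumption rather than the patent's stated rule, which keeps every coded state below γ·A_n and matches the decoding flow below. Function and variable names are illustrative.

```python
from collections import Counter

def afse_encode(data, gamma=256):
    """Adaptive finite state entropy encoding (sketch of steps M1-M3)."""
    if not data:
        return [], [], 0
    counts = Counter(data)
    sigma = sorted(counts)
    index = {s: i for i, s in enumerate(sigma)}
    F = [counts[s] + 1 for s in sigma]        # preprocessing: self-increment-1 on every F_i
    digits = []                               # coded symbols t_{x mod γ}, stored as integers
    x = F[index[data[-1]]] - 1                # default initial state x = F_z - 1
    for s in reversed(data):                  # M1: access the data to be coded in reverse order
        i = index[s]
        F[i] -= 1                             # self-decreasing-1 update of F_i
        f, M, B = F[i], sum(F), sum(F[:i])    # F_i, total A_n and cumulative A_{i-1}
        # (the patent maintains A with a binary index tree; plain sums keep the sketch short)
        while x >= gamma * f:                 # renormalization (assumed threshold)
            digits.append(x % gamma)          # output coded symbol t_{x mod γ}
            x //= gamma
        x = M * (x // f) + B + (x % f)        # M2: x' = C(x, s_i)
    while x > 0:                              # M3: flush the state until it returns to zero
        digits.append(x % gamma)
        x //= gamma
    return sigma, digits, len(data)
```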
On the basis of the above technical solution, the decoding process includes the following steps:
initializing a frequency set with elements of 1, generating a corresponding cumulative distribution set, initializing an alphabet set of data to be decoded and an alphabet set of decoding output data, and initializing a value of a state;
accessing data to be decoded in a reverse order, finishing state updating and decoding symbol output based on a decoding rule by combining a reverse reforming technology, and updating a frequency set and an accumulated distribution set;
repeatedly performing inverse reforming on the state variable, wherein if the final state is consistent with the initial state of the code, the decoding is successful, otherwise, the decoding is wrong;
wherein, the data to be decoded is the coded output data in the coding step;
the decoding output data is the data to be encoded in the encoding step.
On the basis of the above technical solution, the decoding process includes the following steps:
N1, when the length of the decoded output data is greater than or equal to the length of the original data, entering step N2; otherwise processing according to the current state x: when x < A_n, x is inverse-reformed: a character t_i to be decoded is read in in reverse order; if the reading fails, the next step N2 is entered; otherwise the current state x and the read-in t_i are substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N1 is repeated; when x ≥ A_n, x is substituted into D(x) = (x', s_i), where s_i is taken as the decoding output, the current state x is changed to x', i.e. x = x', a self-increment-1 update is performed on F_i in the frequency set F, the cumulative distribution set A is updated accordingly, and step N1 is repeated;
N2, a character t_i to be decoded is read in; if the reading succeeds, the current state x is substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N2 is repeated; when the reading fails, it is judged whether the current state x equals the initial state of the encoding: equality indicates that the data has been correctly recovered, and inequality indicates that the data is wrong or has been tampered with.
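A matching decode sketch for steps N1 and N2, under the same assumptions as the encoding sketch above; the coded symbols are consumed in reverse order of emission, and the caller compares the returned final state with the encoder's initial state as the integrity check.

```python
from bisect import bisect_right

def afse_decode(sigma, digits, orig_len, gamma=256):
    """Adaptive finite state entropy decoding (sketch of steps N1-N2)."""
    n = len(sigma)
    F = [1] * n                               # initial frequency set with all elements 1
    out = []
    x = 0
    pos = len(digits) - 1                     # read the data to be decoded in reverse order
    while len(out) < orig_len:                # N1
        A = [0]
        for f in F:
            A.append(A[-1] + f)               # cumulative distribution set, A[n] = A_n
        M = A[n]
        if x < M:                             # inverse reforming C_γ(x, t_i) = γ·x + i
            if pos < 0:
                break                         # reading fails: fall through to the final check
            x = gamma * x + digits[pos]
            pos -= 1
        else:                                 # decode: D(x) = (x', s_i)
            r = x % M
            i = bisect_right(A, r) - 1        # s_i with A_{i-1} <= (x mod A_n) < A_i
            out.append(sigma[i])
            x = F[i] * (x // M) + r - A[i]
            F[i] += 1                         # self-increment-1 update after the output
    while pos >= 0:                           # N2: absorb any remaining coded symbols
        x = gamma * x + digits[pos]
        pos -= 1
    return "".join(out), x

sigma, digits, length = afse_encode("ABRACADABRA")   # encoder sketch from above
text, final_state = afse_decode(sigma, digits, length)
# text == "ABRACADABRA" and final_state equals the encoder's default initial state
```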
On the basis of the technical scheme, the encryption process comprises the following steps:
transforming the alphabet set of the data to be encoded according to the key to obtain a new alphabet set of the data to be encoded;
obtaining a new encoding output alphabet set according to the key transformation encoding output alphabet set;
and performing self-adaptive finite state entropy coding on the data to be coded through the two new alphabet sets to complete data encryption.
Compared with the prior art, the invention has the advantages that:
the self-adaptive finite state entropy coding method avoids pre-storing frequency or probability information in coded data, also avoids the problem that a static coding method cannot adapt to information with large change of statistical rules, can provide stable and reliable compression ratio for any data, has higher coding speed, and can better meet the coding requirements at the present stage.
Drawings
FIG. 1 is a flow chart of adaptive finite state entropy encoding;
FIG. 2 is a flow chart of decoding of adaptive finite state entropy;
FIG. 3 is an example of adaptive finite state entropy encoding;
FIG. 4 is an example of adaptive finite state entropy decoding;
FIG. 5 is a schematic diagram of a binary index tree;
fig. 6 is an example of encryption based on adaptive finite state entropy coding.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
The embodiment of the invention provides a self-adaptive finite state entropy coding method, which can avoid pre-storing frequency information in coded data, greatly improve the coding precision, and has the advantages of simple coding rule, high coding speed and excellent comprehensive performance.
In order to achieve the technical effects, the general idea of the application is as follows:
a method of adaptive finite state entropy coding, comprising the steps of:
s1, scanning the data to be coded to obtain a frequency set;
s2, performing self-increment 1 updating on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing the value of the state;
s3, trying to read a character to be coded, if the reading is successful, entering the step S4, otherwise, entering the step S6;
s4, updating the frequency set according to the current character to be coded, and updating the cumulative distribution set correspondingly;
s5, based on the coding rule and combined with the reforming processing, updating the state and outputting the codes;
and S6, repeating the reforming processing until the state returns to zero.
It should be noted that the initial state of the above coding must be set so as to avoid a state loop, in which the new state is the same as the old state during coding, no code is output, and the coding falls into an endless loop.
The above reforming (renormalization) processing avoids the problem that the coding state grows without bound and cannot be represented in a computer; after the reforming processing, the coding state is bounded and the output of coded symbols is guaranteed.
Example one
Referring to fig. 1, an embodiment of the present invention provides an adaptive finite state entropy coding method, where the method includes an encoding process, and the encoding process includes the following steps:
S1, scanning the original data to be coded, and calculating an initial alphabet set Σ and a frequency set F;
S2, to ensure that no F_i = 0 appears in the frequency set F during the self-adaptive finite state entropy coding, 1 is first added to every element of the statistical frequency set, i.e. a new set F = {F_1+1, F_2+1, …, F_n+1} is generated, and the corresponding cumulative distribution set A is calculated from the elements of the new frequency set F. Meanwhile, to ensure that the decoded data has the same order as the original data, the data needs to be read in order from tail to head. Finally, an initial value is given to the state x, where x satisfies the admissibility requirement determined by s_z, the first character to be coded; if the data needs to be checked after decoding is finished, x is any random integer meeting the condition, and the initial state x and the original data length need to be stored in the header of the coded output data; otherwise the default check is used, namely x = F_z - 1; as shown in fig. 3, x = F_z - 1 = F_1 - 1 = 2 - 1 = 1;
S3, trying to read a character for encoding according to the sequence from the tail to the head, if the reading fails, indicating that the data to be encoded is completely read, entering the step S6, otherwise entering the step S4;
S4, according to the currently read character s_i, performing a self-decreasing-1 operation on element F_i in the frequency set F to generate a new set F = {F_1, F_2, …, F_i - 1, …, F_n}; since the value of F_i has changed, the update of the cumulative distribution set A is completed in a binary index tree before proceeding to step S5. As shown in fig. 5, the cumulative distribution set A does not actually exist as a plain array: when F_i in the frequency set F changes, by the properties of the binary index tree only log(n) elements of the tree array A' need to be updated in the worst case, and when the character s_i is encoded only log(n) elements of A' need to be summed in the worst case to obtain the term A_{i-1} required by the coding function C(x, s_i);
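The binary index tree mentioned here is the standard Fenwick tree; a compact Python sketch (a textbook structure, not code from the patent) showing that updating one F_i and querying a prefix sum A_{i-1} each take O(log n) operations:

```python
class FenwickTree:
    """Binary index tree over the frequency set F; prefix(i) returns A_i."""
    def __init__(self, F):
        self.n = len(F)
        self.t = [0] * (self.n + 1)
        for i, f in enumerate(F, start=1):
            self.add(i, f)

    def add(self, i, delta):            # F_i += delta (1-based index), O(log n)
        while i <= self.n:
            self.t[i] += delta
            i += i & -i

    def prefix(self, i):                # A_i = F_1 + ... + F_i, O(log n)
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

bit = FenwickTree([2, 3, 4])            # preprocessed frequencies of {a, b, c} from fig. 3
bit.add(3, -1)                          # encode 'c': self-decreasing-1 update of F_3
assert bit.prefix(2) == 5               # A_2 = F_1 + F_2, needed as A_{i-1} for C(x, s_i)
```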
S5, the step reforms the state x, according to the current frequency set F and the cumulative distribution set A, the current state x and the symbol S to be codediBy bringing into C (x, s)i) In the method, a new state x 'is obtained by calculation, and then x' is substituted into a renormalization functionAnd discussed in two cases:
case 1: if it is calculatedThen a coded symbol t is outputx′modγAnd changing the current state x toNamely, it isAnd the present step S5 is repeated again,
case 2: if it is calculatedThe current state x is changed to x ', that is, x ═ x', and the process proceeds to step S3 to encode the next character;
s6, this step corresponds to the coded data being read, but the coding is not completely completed, and the state x is still larger than 0, so x needs to be repeatedly brought intoUp to x ═ 0, the same is discussed here in two cases:
case 1: if it is calculatedThen a coded symbol t is outputx modγAnd changing the current state x toNamely, it isThe present step S6 is repeated again,
case 2: if it is calculatedThen a coded symbol t is outputx modγWhen the final state x is 0, the encoding is finished, and as shown in fig. 3, the state x is 0 at the end of the encoding;
wherein the alphabet set Σ = {s_1, s_2, …, s_n} and the frequency set in simplified representation is F = {F_1, F_2, …, F_n}; scanning the data to be encoded "ccbca" may, as shown in fig. 3, result in the alphabet set Σ = {a, b, c} and the initial frequency set F = {1, 2, 3};
the definition of the cumulative distribution set A is related to the frequency set F: A_0 = 0 and A_i = F_1 + F_2 + … + F_i for i = 1, …, n;
the coding function C(x, s_i) is defined as follows:
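An assumed reconstruction, not the patent's verbatim formula: under the standard asymmetric numeral system construction, which is consistent with the decoding function D(x) and the relation A_i = A_{i-1} + F_i described in the second embodiment, the coding function takes the form

$$x' = C(x, s_i) = A_n \left\lfloor \frac{x}{F_i} \right\rfloor + A_{i-1} + (x \bmod F_i),$$

so that x' mod A_n falls inside [A_{i-1}, A_i), which is exactly the interval the decoding function searches to recover s_i.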
the encoding output alphabet Γ = {t_0, t_1, …, t_{γ-1}}, where Γ is defined as an ordered set of natural numbers, i.e. Γ = {0, 1, …, γ-1}; γ = 2 and γ = 2^8 then correspond to a bit stream and a byte stream respectively;
according to the definition of the encoding alphabet Γ, the coding-time reforming function D_γ(x) is defined so that ⌊x/γ⌋ corresponds to the reformed x and t_{x mod γ} corresponds to the output coded character; since (x mod γ) ∈ [0, γ-1], it is apparent that t_{x mod γ} must be in the set of encoded output alphabets Γ.
Example two
Referring to fig. 2, the second embodiment of the present invention further provides an adaptive finite state entropy coding method, which further includes a corresponding decoding process for data decoding, where the decoding process includes the following steps:
A1, the decoding side has the same original-data alphabet set Σ = {s_1, s_2, …, s_n} and the same encoding output alphabet set Γ = {t_0, t_1, …, t_{γ-1}} as the encoding side. The state at the end of encoding is x = 0, and since the frequency set had 1 added to every element at the beginning of encoding and is decremented once for every encoded symbol, the frequency set F is all 1s at the end of encoding; the corresponding initial frequency set for decoding is therefore all 1s, i.e. F = {F_1, F_2, …, F_n} = {1, 1, …, 1}, the corresponding cumulative distribution set A = {A_0, A_1, …, A_n} is established from F, and step A2 is entered. Since F is all 1s, the definition of A_i yields A = {0, 1, 2, …, n}; as shown in fig. 4, the initial frequency set F = {1, 1, 1} and A = {0, 1, 2, 3};
a2, when the length of the decoded output data is less than the length of the original data, entering the step A3, otherwise entering the step A5;
a3, the step is mainly used for carrying out reverse reforming on the state x, and the following two conditions are processed according to the current state x:
Case 1: if x < A_n, x is inverse-reformed: an attempt is made to read in one character t_i of the data to be decoded in order from beginning to end; if the reading fails, the next step A4 is entered; otherwise the current state x and the read-in t_i are substituted into the inverse reforming function C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step A2 is re-entered;
Case 2: if x ≥ A_n, x is substituted into D(x) = (x', s_i), where s_i is taken as the decoding output, the current state x is changed to x', i.e. x = x', and the process proceeds to step A4;
A4, for the s_i output from decoding, performing a self-increment-1 operation on F_i in the frequency set F, i.e. F = {F_1, F_2, …, F_i + 1, …, F_n}, finishing the update of the cumulative distribution set A using the binary index tree, and then entering step A2;
A5, when this step is reached the original data has been recovered, but characters of the data to be decoded may still remain, so an attempt is made to read a character t_i of the data to be decoded; if the reading succeeds, the current state x is substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step A5 is repeated; if the reading fails, it is judged whether the current state x equals the initial state of the encoding; if so, the data has been correctly recovered, otherwise the data is wrong or has been tampered with. As shown in fig. 3, the initial state of the encoding is 1, and the final decoding state shown in fig. 4 is also 1; the two are equal and the decoding succeeds;
wherein the inverse reforming function C_γ(x, t_i) is defined as follows:
x' = C_γ(x, t_i) = γ·x + i
in the above formula, x is the current state and t_i is the currently read coded output symbol; substituting them into C_γ(x, t_i) yields the new state x';
wherein the decoding function D(x) is defined as follows:
D(x) = (x', s_i), where s_i satisfies (x mod A_n) ∈ [A_{i-1}, A_i);
the meaning of the above condition on s_i is: by looking up in the set A the element that is closest to and not greater than (x mod A_n), namely A_{i-1}, the corresponding s_i is found; s_i is then the decoded output, and substituting its index i into the corresponding state-transition function yields the new state x', which is used as the decoding state of the next round;
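The state-transition part of D(x) is again an assumed reconstruction, namely the exact inverse of the coding function sketched after the definition of C(x, s_i) above, and not the patent's verbatim formula:

$$D(x) = (x', s_i), \qquad x' = F_i \left\lfloor \frac{x}{A_n} \right\rfloor + (x \bmod A_n) - A_{i-1},$$

which undoes C exactly: x mod A_n recovers A_{i-1} plus the old state modulo F_i, and ⌊x/A_n⌋ recovers the old state divided by F_i.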
In step A3, from the definition of A_i it can be deduced that A_i = A_{i-1} + F_i with F_i > 0, so the cumulative distribution set A can be viewed as an increasing sequence, and the lookup of s_i in the decoding function D(x) can be reduced to time complexity O(log(n)) by binary search.
Example three
The third embodiment of the present invention further provides a method of adaptive finite state entropy coding, which further includes the following steps for performance optimization:
when n = 2 in the alphabet set Σ = {s_1, s_2, …, s_n}, i.e. when encoding and decoding bit data, the frequency set F and the cumulative distribution set A contain only 2 and 3 elements respectively, while A_0 ≡ 0 and A_1 ≡ F_1, so the number of actually useful elements in the cumulative distribution set A is only 1; using a binary index tree to maintain and update the cumulative distribution set A, or using binary search to find s_i during decoding, has no acceleration effect, so in this case arrays or individual variables are used to maintain the elements of the frequency set F and the cumulative distribution set A, and a conditional branch is used to determine s_i, which is more efficient;
when n = 2^8 in the alphabet set Σ = {s_1, s_2, …, s_n}, i.e. when encoding and decoding data in byte units, updating an element of the binary index tree, obtaining an element A_{i-1} of the cumulative distribution set A, and using binary search for s_i in decoding each require at most log_2(2^8) = 8 operations in the worst case; since the maximum number of loop iterations is fixed, loop unrolling can greatly improve the operation efficiency;
when γ = 2^m in the output encoding alphabet set Γ = {t_0, t_1, …, t_{γ-1}}, the modulo, integer multiplication and integer division operations in the reforming process can be replaced by efficient bitwise AND, left-shift and right-shift operations, and the corresponding reforming function and inverse reforming function are transformed as follows:
x mod γ = x & (2^m - 1), ⌊x/γ⌋ = x >> m;
x' = C_γ(x, t_i) = γ·x + i = (x << m) + i
In addition, the step size of encoding and decoding can be increased and the cumulative distribution set A can then be updated directly with an array. The encoding and decoding described above read or output one character at a time and update the cumulative distribution set A with a binary index tree, so each character requires O(log(n)) operations in the worst case. If an array is used directly, O(n) operations are required per character in the worst case. If the step size is enlarged to SL ≥ n, i.e. the cumulative distribution set A is rebuilt only once for every SL characters read in, then each character's encoding or decoding spends only O(n/SL) operations on updating A on average; after the step size is increased, directly using an array to update the cumulative distribution set A therefore performs much better.
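A short Python sketch of the bit-operation replacement described here, assuming γ = 2^m; the renormalization threshold follows the same assumption as the encoding sketch in the summary above, and the names are illustrative.

```python
M_BITS = 8                      # γ = 2^m; here m = 8, i.e. the byte stream γ = 256
GAMMA = 1 << M_BITS
MASK = GAMMA - 1

def renormalize(x, f, digits):
    """Push low base-γ digits of the state while x >= γ·f, using shifts only."""
    while x >= (f << M_BITS):   # x >= γ·f  (assumed threshold, as in the earlier sketch)
        digits.append(x & MASK) # x mod γ  -> bitwise AND with 2^m - 1
        x >>= M_BITS            # x // γ   -> right shift by m
    return x

def inverse_renormalize(x, digit):
    """x' = C_γ(x, t_i) = γ·x + i, written as (x << m) + i."""
    return (x << M_BITS) + digit
```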
Example four
As shown in fig. 6, the fourth embodiment of the present invention further provides an adaptive finite state entropy coding method, which further includes an encryption process for data encryption, where the encryption process includes the following steps:
the alphabet set Σ of the data to be encoded and the key K_Σ are substituted into the transformation function f(Σ, K_Σ) to obtain a new alphabet set Σ' of the data to be coded;
the encoding output alphabet set Γ and the key K_Γ are substituted into the transformation function f(Γ, K_Γ) to obtain a new encoding output alphabet set Γ';
and adaptive finite state entropy coding is performed on the data to be coded with the two new alphabet sets Σ' and Γ', so as to obtain the encrypted data.
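A sketch of the encryption step, assuming that f(Σ, K) is a key-driven permutation of an alphabet; the patent does not fix a concrete transformation, so the seeded shuffle below is only an illustration, and the key names are hypothetical.

```python
import random

def permute_alphabet(alphabet, key):
    """f(Σ, K): derive a key-dependent permutation of an alphabet (illustrative)."""
    rng = random.Random(key)            # any keyed permutation could be used here
    permuted = list(alphabet)
    rng.shuffle(permuted)
    return permuted

sigma_prime = permute_alphabet(['a', 'b', 'c'], key="K_sigma")        # Σ' = f(Σ, K_Σ)
gamma_prime = permute_alphabet(list(range(256)), key="K_gamma")       # Γ' = f(Γ, K_Γ)
# Adaptive finite state entropy coding is then run with Σ' and Γ' in place of Σ and Γ,
# so the coded output cannot be interpreted without the keys.
```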
In the embodiment of the invention, the encryption flow can improve the safety when the encoding side and the decoding side carry out data interaction.
Example five
The fifth embodiment of the present invention further provides a self-adaptive finite state entropy coding method, wherein the coding process of the method includes the following steps:
M1, when the data to be coded is empty, entering step M3; otherwise reading a character s_i, performing a self-decreasing-1 update on element F_i in the frequency set F, and updating the cumulative distribution set A accordingly;
M2, substituting the frequency set F and the cumulative distribution set A, the current state x and the symbol s_i to be coded into C(x, s_i) to calculate a new state x', and then substituting x' into the reforming function D_γ: when x' satisfies the reforming condition, outputting a coded symbol t_{x' mod γ}, changing the current state x to ⌊x'/γ⌋, i.e. x = ⌊x'/γ⌋, and repeating step M2; when x' does not satisfy the reforming condition, changing the current state x to x' and re-entering step M1;
M3, if the data to be coded is empty and the state x is still greater than 0, repeatedly substituting x into the reforming function D_γ: when ⌊x/γ⌋ is greater than 0, outputting a coded symbol t_{x mod γ}, changing the current state x to ⌊x/γ⌋, i.e. x = ⌊x/γ⌋, and repeating step M3; when ⌊x/γ⌋ equals 0, outputting a coded symbol t_{x mod γ}; the final state x is then 0 and the encoding is finished;
it should be noted that, before the step M1, the encoding flow should further include the following steps in sequence:
scanning the data to be encoded to obtain a frequency set and an alphabet set of symbols, and initializing the alphabet set of encoding output data;
performing self-increment 1 preprocessing on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing a value of a state;
then, step M1 is started again, that is, it is determined whether the data to be encoded is empty, and then the subsequent steps are performed according to the determination result.
In another implementation manner of the embodiment of the present invention, a decoding flow of the method includes the following steps:
N1, when the length of the decoded output data is greater than or equal to the length of the original data, entering step N2; otherwise processing according to the current state x: when x < A_n, x is inverse-reformed: a character t_i to be decoded is read in in reverse order; if the reading fails, the next step N2 is entered; otherwise the current state x and the read-in t_i are substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N1 is repeated; when x ≥ A_n, x is substituted into D(x) = (x', s_i), where s_i is taken as the decoding output, the current state x is changed to x', i.e. x = x', a self-increment-1 update is performed on F_i in the frequency set F, the cumulative distribution set A is updated accordingly, and step N1 is repeated;
N2, a character t_i to be decoded is read in in reverse order; if the reading succeeds, the current state x is substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N2 is repeated; when the reading fails, it is judged whether the current state x equals the initial state of the encoding: equality indicates that the data has been correctly recovered, and inequality indicates that the data is wrong or has been tampered with;
it should be noted that, before step N1, the decoding flow should further include the following steps in sequence:
initializing a frequency set with elements of 1, generating a corresponding cumulative distribution set, initializing an alphabet set of data to be decoded and an alphabet set of decoding output data, and initializing a value of a state;
then, step N1 is started, i.e. it is determined whether the data length of the decoded output data has reached or exceeded the original data length, and then the subsequent steps are performed according to the determination result.
Example six
The sixth embodiment of the present invention further provides a method for adaptive finite state entropy coding, which includes the following steps:
Encoding flow: scanning the data to be coded to obtain a frequency set of symbols, preprocessing the frequency set, dynamically maintaining and updating the frequency set and the cumulative distribution set according to the current symbol to be coded, and performing adaptive coding based on the coding rule combined with reforming processing to obtain coded output data;
Decoding flow: establishing an initial frequency set with all elements equal to 1, reading in the data to be decoded, performing self-adaptive decoding based on the decoding rule combined with inverse reforming processing, and dynamically maintaining and updating the frequency set and the cumulative distribution set according to the symbol output by the current decoding to obtain decoded output data;
Encryption flow: transforming the alphabet set of the data to be coded and the alphabet set of the coded output data, and carrying out self-adaptive finite state entropy coding on the data to be coded according to the two transformed sets to obtain the encrypted data.
The embodiment of the invention can avoid pre-storing frequency information in the coded data, can greatly improve the coding precision, and has simple coding rule, high coding speed and excellent comprehensive performance.
In another implementation manner of the embodiment of the present invention, the preprocessing in the encoding process of the method specifically includes performing a self-increment 1 update operation on all elements in the original frequency set.
In another implementation manner of the embodiment of the present invention, the encoding process includes the following steps:
scanning the data to be encoded to obtain a frequency set and an alphabet set of symbols, and initializing the alphabet set of encoding output data;
performing self-increment 1 preprocessing on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing a value of a state;
accessing the data to be coded in reverse order, updating the frequency set and the corresponding cumulative distribution set according to the current symbol to be coded, updating the state based on the coding rule combined with reforming processing, and outputting coded symbols;
repeatedly reforming the state until the state returns to zero;
wherein the alphabet set is denoted as Σ = {s_1, s_2, …, s_n};
the encoding output alphabet set is Γ = {t_0, t_1, …, t_{γ-1}};
the state is a variable x;
the frequency set records the number of occurrences F_i of each symbol s_i and is written in simplified form as F = {F_1, F_2, …, F_n};
the cumulative distribution set is denoted as A, where element A_i = F_1 + F_2 + … + F_i and A_0 = 0.
in another implementation manner of the embodiment of the present invention, an encoding flow of the method includes the following steps:
M1, when the data to be coded is empty, entering step M3; otherwise reading a character s_i, performing a self-decreasing-1 update on element F_i in the frequency set F, updating the cumulative distribution set A accordingly, and then entering step M2;
M2, substituting the frequency set F and the cumulative distribution set A, the current state x and the symbol s_i to be coded into C(x, s_i) to calculate a new state x', and then substituting x' into the reforming function D_γ: when x' satisfies the reforming condition, outputting a coded symbol t_{x' mod γ}, changing the current state x to ⌊x'/γ⌋, i.e. x = ⌊x'/γ⌋, and repeating step M2; when x' does not satisfy the reforming condition, changing the current state x to x' and re-entering step M1;
M3, if the data to be coded is empty and the state x is still greater than 0, repeatedly substituting x into the reforming function D_γ: when ⌊x/γ⌋ is greater than 0, outputting a coded symbol t_{x mod γ}, changing the current state x to ⌊x/γ⌋, i.e. x = ⌊x/γ⌋, and repeating step M3; when ⌊x/γ⌋ equals 0, outputting a coded symbol t_{x mod γ}; the final state x is then 0 and the encoding is finished;
it should be noted that, before the step M1, the encoding flow should further include the following steps in sequence:
scanning the data to be encoded to obtain a frequency set and an alphabet set of symbols, and initializing the alphabet set of encoding output data;
performing self-increment 1 preprocessing on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing a value of a state;
then, step M1 is started again, that is, it is determined whether the data to be encoded is empty, and then the subsequent steps are performed according to the determination result.
In another implementation manner of the embodiment of the present invention, the decoding process includes the following steps:
initializing a frequency set with elements of 1, generating a corresponding cumulative distribution set, initializing an alphabet set of data to be decoded and an alphabet set of decoding output data, and initializing a value of a state;
accessing data to be decoded in a reverse order, finishing state updating and decoding symbol output based on a decoding rule by combining a reverse reforming technology, and updating a frequency set and an accumulated distribution set;
repeatedly performing inverse reforming on the state variable, wherein if the final state is consistent with the initial state of the code, the decoding is successful, otherwise, the decoding is wrong;
wherein, the data to be decoded is the coded output data in the coding step;
the decoded output data is the data to be encoded in the encoding step.
In another implementation manner of the embodiment of the present invention, the specific steps of the decoding rule and the inverse renormalization in the method are as follows:
N1, when the length of the decoded output data is greater than or equal to the length of the original data, entering step N2; otherwise processing according to the current state x: when x < A_n, x is inverse-reformed: a character t_i to be decoded is read in in reverse order; if the reading fails, the next step N2 is entered; otherwise the current state x and the read-in t_i are substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N1 is repeated; when x ≥ A_n, x is substituted into D(x) = (x', s_i), where s_i is taken as the decoding output, the current state x is changed to x', i.e. x = x', a self-increment-1 update is performed on F_i in the frequency set F, the cumulative distribution set A is updated accordingly, and step N1 is repeated;
N2, a character t_i to be decoded is read in; if the reading succeeds, the current state x is substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N2 is repeated; when the reading fails, it is judged whether the current state x equals the initial state of the encoding: equality indicates that the data has been correctly recovered, and inequality indicates that the data is wrong or has been tampered with;
it should be noted that, before step N1, the decoding flow should further include the following steps in sequence:
initializing a frequency set with elements of 1, generating a corresponding cumulative distribution set, initializing an alphabet set of data to be decoded and an alphabet set of decoding output data, and initializing a value of a state;
then, step N1 is started, i.e. it is determined whether the data length of the decoded output data has reached or exceeded the original data length, and then the subsequent steps are performed according to the determination result.
In another implementation manner of the embodiment of the present invention, the encryption process includes the following steps:
transforming the alphabet set of the data to be encoded according to the key to obtain a new alphabet set of the data to be encoded;
obtaining a new encoding output alphabet set according to the key transformation encoding output alphabet set;
and performing self-adaptive finite state entropy coding on the data to be coded through the two new alphabet sets to complete data encryption.

Claims (7)

1. An adaptive finite state entropy coding method, characterized in that it comprises the steps of:
Encoding flow: scanning the data to be coded to obtain a frequency set of symbols, preprocessing the frequency set, dynamically maintaining and updating the frequency set and the cumulative distribution set according to the current symbol to be coded, and performing adaptive coding based on the coding rule combined with reforming processing to obtain coded output data;
Decoding flow: establishing an initial frequency set with all elements equal to 1, reading in the data to be decoded, performing self-adaptive decoding based on the decoding rule combined with inverse reforming processing, and dynamically maintaining and updating the frequency set and the cumulative distribution set according to the symbol output by the current decoding to obtain decoded output data;
Encryption flow: transforming the alphabet set of the data to be coded and the alphabet set of the coded output data, and carrying out self-adaptive finite state entropy coding on the data to be coded according to the two transformed sets to obtain the encrypted data.
2. The finite state entropy coding method of claim 1, wherein the preprocessing is specifically a self-increment 1 update operation on all elements in the original frequency set.
3. The finite state entropy coding method of claim 1, wherein the encoding process comprises the steps of:
scanning the data to be encoded to obtain a frequency set and an alphabet set of symbols, and initializing the alphabet set of encoding output data;
performing self-increment 1 preprocessing on all elements of the frequency set, generating a corresponding cumulative distribution set, and initializing a value of a state;
accessing the data to be coded in reverse order, updating the frequency set and the corresponding cumulative distribution set according to the current symbol to be coded, updating the state based on the coding rule combined with reforming processing, and outputting coded symbols;
repeatedly reforming the state until the state returns to zero;
wherein the alphabet set is denoted as Σ = {s_1, s_2, …, s_n};
the encoding output alphabet set is Γ = {t_0, t_1, …, t_{γ-1}};
the state is a variable x;
the frequency set records the number of occurrences F_i of each symbol s_i and is written in simplified form as F = {F_1, F_2, …, F_n};
the cumulative distribution set is denoted as A, where element A_i = F_1 + F_2 + … + F_i and A_0 = 0.
4. the finite state entropy coding method of claim 1, wherein the encoding process comprises the steps of:
M1, when the data to be coded is empty, entering step M3; otherwise reading a character s_i, performing a self-decreasing-1 update on element F_i in the frequency set F, updating the cumulative distribution set A accordingly, and then entering step M2;
M2, substituting the frequency set F and the cumulative distribution set A, the current state x and the symbol s_i to be coded into C(x, s_i) to calculate a new state x', and then substituting x' into the reforming function D_γ: when x' satisfies the reforming condition, outputting a coded symbol t_{x' mod γ}, changing the current state x to ⌊x'/γ⌋, i.e. x = ⌊x'/γ⌋, and repeating step M2; when x' does not satisfy the reforming condition, changing the current state x to x' and re-entering step M1;
M3, when the data to be coded is empty and the state x is still greater than 0, repeatedly substituting x into the reforming function D_γ: when ⌊x/γ⌋ is greater than 0, outputting a coded symbol t_{x mod γ}, changing the current state x to ⌊x/γ⌋, i.e. x = ⌊x/γ⌋, and repeating step M3; when ⌊x/γ⌋ equals 0, outputting a coded symbol t_{x mod γ}; the final state x is then 0 and the encoding is finished.
5. The finite state entropy encoding method of claim 1, wherein the decoding process comprises the steps of:
initializing a frequency set with elements of 1, generating a corresponding cumulative distribution set, initializing an alphabet set of data to be decoded and an alphabet set of decoding output data, and initializing a value of a state;
accessing data to be decoded in a reverse order, finishing state updating and decoding symbol output based on a decoding rule by combining a reverse reforming technology, and updating a frequency set and an accumulated distribution set;
repeatedly performing inverse reforming on the state variable, wherein if the final state is consistent with the initial state of the code, the decoding is successful, otherwise, the decoding is wrong;
wherein, the data to be decoded is the coded output data in the coding step;
the decoding output data is the data to be encoded in the encoding step.
6. The finite state entropy encoding method of claim 1, wherein the decoding process comprises the steps of:
N1, when the length of the decoded output data is greater than or equal to the length of the original data, entering step N2; otherwise processing according to the current state x: when x < A_n, x is inverse-reformed: a character t_i to be decoded is read in in reverse order; if the reading fails, the next step N2 is entered; otherwise the current state x and the read-in t_i are substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N1 is repeated; when x ≥ A_n, x is substituted into D(x) = (x', s_i), where s_i is taken as the decoding output, the current state x is changed to x', i.e. x = x', a self-increment-1 update is performed on F_i in the frequency set F, the cumulative distribution set A is updated accordingly, and step N1 is repeated;
N2, a character t_i to be decoded is read in; if the reading succeeds, the current state x is substituted into C_γ(x, t_i) to obtain a new state x', the current state x is changed to x', i.e. x = x', and step N2 is repeated; when the reading fails, it is judged whether the current state x equals the initial state of the encoding: equality indicates that the data has been correctly recovered, and inequality indicates that the data is wrong or has been tampered with.
7. The finite state entropy coding method of claim 1, wherein the encryption process comprises the steps of:
transforming the alphabet set of the data to be encoded according to the key to obtain a new alphabet set of the data to be encoded;
obtaining a new encoding output alphabet set according to the key transformation encoding output alphabet set;
and performing self-adaptive finite state entropy coding on the data to be coded through the two new alphabet sets to complete data encryption.
CN201910890254.8A 2019-09-20 2019-09-20 Self-adaptive finite state entropy coding method Active CN110602498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910890254.8A CN110602498B (en) 2019-09-20 2019-09-20 Self-adaptive finite state entropy coding method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910890254.8A CN110602498B (en) 2019-09-20 2019-09-20 Self-adaptive finite state entropy coding method

Publications (2)

Publication Number Publication Date
CN110602498A true CN110602498A (en) 2019-12-20
CN110602498B CN110602498B (en) 2022-03-01

Family

ID=68861477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910890254.8A Active CN110602498B (en) 2019-09-20 2019-09-20 Self-adaptive finite state entropy coding method

Country Status (1)

Country Link
CN (1) CN110602498B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113572479A (en) * 2021-09-22 2021-10-29 苏州浪潮智能科技有限公司 Method and system for generating finite state entropy coding table
CN116933734A (en) * 2023-09-15 2023-10-24 山东济矿鲁能煤电股份有限公司阳城煤矿 Intelligent diagnosis method for cutter faults of shield machine

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1535022A (en) * 2003-12-14 2004-10-06 浙江大学 Information ontropy holding decoding method and device
CN1560823A (en) * 2004-02-19 2005-01-05 李春林 Data encipher and decipher system based on dynamic variable-length code
CN101005603A (en) * 2006-01-18 2007-07-25 华中科技大学 Method and device for enciphering, deenciphering and transfer code of image data
CN101465724A (en) * 2009-01-06 2009-06-24 中国科学院软件研究所 Encrypted Huffman encoding method and decoding method
US20160248440A1 (en) * 2015-02-11 2016-08-25 Daniel Greenfield System and method for compressing data using asymmetric numeral systems with probability distributions
US20170164007A1 (en) * 2015-12-07 2017-06-08 Google Inc. Mixed boolean-token ans coefficient coding

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1535022A (en) * 2003-12-14 2004-10-06 浙江大学 Information ontropy holding decoding method and device
CN1560823A (en) * 2004-02-19 2005-01-05 李春林 Data encipher and decipher system based on dynamic variable-length code
CN101005603A (en) * 2006-01-18 2007-07-25 华中科技大学 Method and device for enciphering, deenciphering and transfer code of image data
CN101465724A (en) * 2009-01-06 2009-06-24 中国科学院软件研究所 Encrypted Huffman encoding method and decoding method
US20160248440A1 (en) * 2015-02-11 2016-08-25 Daniel Greenfield System and method for compressing data using asymmetric numeral systems with probability distributions
US20170164007A1 (en) * 2015-12-07 2017-06-08 Google Inc. Mixed boolean-token ans coefficient coding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JAREK DUDA: "Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding", 《HTTP://ARXIV.ORG/ABS/1311.2540》 *
JUHA KARKKAINEN: "Data Compression Techniques Part 1: Entropy Coding Lecture 4: Asymmetric Numeral Systems", 《HTTP://COURSES.HELSINKI.FI/SITES/DEFAULT/FILES/COURSE-MATERIAL/4524834/DCT-LECTURE04.PDF》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113572479A (en) * 2021-09-22 2021-10-29 苏州浪潮智能科技有限公司 Method and system for generating finite state entropy coding table
CN113572479B (en) * 2021-09-22 2021-12-21 苏州浪潮智能科技有限公司 Method and system for generating finite state entropy coding table
WO2023045204A1 (en) * 2021-09-22 2023-03-30 苏州浪潮智能科技有限公司 Method and system for generating finite state entropy coding table, medium, and device
CN116933734A (en) * 2023-09-15 2023-10-24 山东济矿鲁能煤电股份有限公司阳城煤矿 Intelligent diagnosis method for cutter faults of shield machine
CN116933734B (en) * 2023-09-15 2023-12-19 山东济矿鲁能煤电股份有限公司阳城煤矿 Intelligent diagnosis method for cutter faults of shield machine

Also Published As

Publication number Publication date
CN110602498B (en) 2022-03-01

Similar Documents

Publication Publication Date Title
JP3017379B2 (en) Encoding method, encoding device, decoding method, decoder, data compression device, and transition machine generation method
US5045852A (en) Dynamic model selection during data compression
US4901075A (en) Method and apparatus for bit rate reduction
US7365658B2 (en) Method and apparatus for lossless run-length data encoding
US20060171533A1 (en) Method and apparatus for encoding and decoding key data
CN110602498B (en) Self-adaptive finite state entropy coding method
US5594435A (en) Permutation-based data compression
KR20120018360A (en) Method for variable length coding and apparatus
US6788224B2 (en) Method for numeric compression and decompression of binary data
CN112995199B (en) Data encoding and decoding method, device, transmission system, terminal equipment and storage medium
CN113630125A (en) Data compression method, data encoding method, data decompression method, data encoding device, data decompression device, electronic equipment and storage medium
JP2007318772A (en) Coding method and apparatus with at least two parallel coding steps and improved permutation, and corresponding decoding method and apparatus
CN116471337A (en) Message compression and decompression method and device based on BWT and LZW
CN113922947B (en) Self-adaptive symmetrical coding method and system based on weighted probability model
JP2023036033A (en) Data encoding method, encoder, and data encoding method
US6101281A (en) Method for improving data encoding and decoding efficiency
US20220060196A1 (en) Data compression using reduced numbers of occurrences
EP3767457A1 (en) Data communication
Li et al. A Novel ANS Coding with Low Computational Complexity
EP3767469A1 (en) Data communication
JPH0629861A (en) Data compression method
JPH0884260A (en) Compression system and expansion system for two-dimension image data
Palunčić et al. Quasi-Enumerative Coding of Balanced Run-Length Limited Codes
Jiang Parallel design of Q-coders for bilevel image compression
Wang High Efficient and Real Time Huffman Codec Used in Handwriting Short Message Service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant