CN106452451A - Data processing method and device - Google Patents

Publication number
CN106452451A
Authority
CN
China
Prior art keywords
matrix
binary number
encoded
character
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610701324.7A
Other languages
Chinese (zh)
Other versions
CN106452451B (en
Inventor
王杰林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Qiannian Huaguang Software Development Co Ltd
Original Assignee
Hunan Qiannian Huaguang Software Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Qiannian Huaguang Software Development Co Ltd filed Critical Hunan Qiannian Huaguang Software Development Co Ltd
Priority to CN201610701324.7A priority Critical patent/CN106452451B/en
Publication of CN106452451A publication Critical patent/CN106452451A/en
Application granted granted Critical
Publication of CN106452451B publication Critical patent/CN106452451B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Abstract

An embodiment of the invention provides a data processing method and device. The method comprises: converting a non-sparse matrix into multiple unit matrices; processing the unit matrices according to a preset elimination rule to form multiple corresponding binary numbers; entropy coding the binary numbers to form multiple compressed binary numbers; if a compressed binary number contains multiple consecutive 1s, converting it according to a preset conversion rule to obtain a converted binary number; and coding the converted binary number according to a preset coding rule to obtain the coded output. The method further improves the data compression ratio, and the original data can be restored without loss.

Description

Data processing method and device
Technical field
The present invention relates to the field of data processing, and in particular to a data processing method and device.
Background
Although existing interval coding and arithmetic coding can compress data to a certain degree, their compression ratio is not high.
Summary of the invention
In view of this, embodiments of the present invention provide a data processing method and device to solve the above problem.
In a first aspect, an embodiment of the present invention provides a data processing method, comprising: converting a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is the table in which all symbols of the non-sparse matrix are arranged in a preset order; processing each cell matrix according to a preset elimination rule to form a corresponding binary number for each; entropy coding each binary number to form multiple compressed binary numbers; if a compressed binary number contains multiple consecutive 1s, converting the compressed binary number according to a preset conversion rule to obtain a converted binary number; and encoding the converted binary number according to a preset coding rule to obtain the coded output.
In a second aspect, an embodiment of the present invention provides a data processing device, comprising: a first conversion module, configured to convert a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is the table in which all symbols of the non-sparse matrix are arranged in a preset order; an elimination module, configured to process each cell matrix according to a preset elimination rule to form a corresponding binary number for each; a first coding module, configured to entropy code each binary number to form multiple compressed binary numbers; a second conversion module, configured to convert a compressed binary number according to a preset conversion rule, if it contains multiple consecutive 1s, to obtain a converted binary number; and a second coding module, configured to encode the converted binary number according to a preset coding rule to obtain the coded output.
Compared with the prior art, the data processing method and device provided by the embodiments of the present invention preprocess the matrix formed by the characters to be encoded into multiple cell matrices, convert each cell matrix into a binary number according to the elimination rule, entropy code each binary number to obtain a compressed binary number, convert the compressed binary number again, and finally encode it according to the preset coding rule, so that the coded output is compressed further and the compression ratio increases. The method can also be applied iteratively to obtain a better compression effect.
To make the above objects, features and advantages of the present invention clearer, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can obtain other related drawings from them without creative effort.
Fig. 1 is a kind of block diagram of server provided in an embodiment of the present invention.
Fig. 2 is a kind of flow chart of data processing method that first embodiment of the invention provides.
Fig. 3 is a partial flow chart of step S320 in the data processing method provided by the first embodiment of the invention.
Fig. 4 is a partial flow chart of step S340 in the data processing method provided by the first embodiment of the invention.
Fig. 5 is a partial flow chart of step S350 in the data processing method provided by the first embodiment of the invention.
Fig. 6 is a functional block diagram of a data processing device provided by the second embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments described and illustrated in the drawings can generally be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be defined or explained again in subsequent drawings. In the description of the present invention, the terms "first", "second" and so on are used only to distinguish items and are not to be understood as indicating or implying relative importance.
Fig. 1 is a block diagram of the server 200. The server includes a data processing device 210, a memory 220, a storage controller 230 and a processor 240.
The memory 220, the storage controller 230 and the processor 240 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements can be electrically connected through one or more communication buses or signal lines. The data processing device 210 includes at least one software function module that can be stored in the memory in the form of software or firmware, or built into the operating system (OS) of the server 200. The processor 240 executes the executable modules stored in the memory 220, such as the software function modules or computer programs included in the data processing device 210.
The memory 220 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and so on. The memory 220 stores a program, and the processor 240 executes the program after receiving an execution instruction. The method performed by the server, as defined by the flow disclosed in any embodiment of the present invention, may be applied in the processor or implemented by the processor.
The processor 240 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor.
Fig. 2 shows a flow chart of a data processing method provided by the first embodiment of the invention. The method includes:
Step S310: a non-sparse matrix is converted into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is the table in which all symbols of the non-sparse matrix are arranged in a preset order.
Step S310 can be implemented in several ways. One is described below; it should be understood that the implementation is not limited to it.
The implementation steps are as follows:
Step 1: Let the non-sparse matrix P be a set of L_p symbols, and let the preset order table S_p be the table in which all symbols of P are arranged in the preset order. P_i ∈ P (i is a natural number, including 0, not greater than L_p - 1), where i is the sequence number of symbol P_i.
Step 2: Let Q be the set of all sequences of length h (h >= 1) formed from the symbols of P, and let S_Q be the table in which the elements of Q are arranged in the lexicographic order induced by S_p; its length is L_SQ. When h = 1, the data is processed one element of the source set at a time; when h > 1, the data is processed in units of sequences of length h formed from elements of the source set.
Step 3: Let R be the data sequence produced by the elements of Q; its length is L_R.
Step 4: Let C be an m*n matrix (m and n are positive integers). Each element C_ij (i ∈ [0, m-1], j ∈ [0, n-1]) is a subsequence of length h cut from R in order and placed in C in matrix element order, so C_ij ∈ Q.
Step 5: In the order of table S_Q, extract from matrix C the cell matrix C_SQ[k] of each element S_Q[k] (k ∈ [0, L_SQ - 1]). C_SQ[k] is an m*n matrix, and each element C_SQ[k]_ij (i ∈ [0, m-1], j ∈ [0, n-1]) takes its value as follows: if C_ij = S_Q[k], then C_SQ[k]_ij = 1; otherwise C_SQ[k]_ij = 0. In this way matrix C is converted into a group of L_SQ cell matrices.
Step 6: Repeat steps (1) to (5). If the remaining length at the end of R is insufficient, it is padded with "0".
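The extraction in step 5 can be sketched in Python for the case h = 1. This is a minimal illustration, not the patent's implementation: the function name `to_cell_matrices`, the small 2*3 matrix and the use of ascending numeric order as the preset order table are all assumptions made for the example.

```python
def to_cell_matrices(c, table):
    """Return one 0/1 matrix per symbol of `table`, in table order.

    c     -- the m*n symbol matrix C (list of rows)
    table -- the preset order table S_Q (assumed here: a plain list)
    """
    return [[[1 if value == symbol else 0 for value in row] for row in c]
            for symbol in table]


# Hypothetical 2*3 matrix over the symbols {0, 1, 2}, with h = 1.
C = [[0, 2, 1],
     [1, 0, 0]]
S_Q = sorted({value for row in C for value in row})  # assumed order: [0, 1, 2]

cells = to_cell_matrices(C, S_Q)
for symbol, cell in zip(S_Q, cells):
    print(symbol, cell)
```

Every position of C holds exactly one symbol, so across the group of cell matrices each position carries exactly one 1. The elimination rule of step S320 relies on this property.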
Step S320: each cell matrix is processed according to the preset elimination rule to form a corresponding binary number.
Referring to Fig. 3, in one embodiment the preset elimination rule includes:
Step S321: the first cell matrix is scanned in order to form a binary number, and the first cell matrix is marked as the eliminated matrix.
Step S322: the second cell matrix is taken as the matrix to be eliminated; the positions at which element 1 appears in the eliminated matrix are obtained and recorded as elimination positions; the elements at the corresponding positions in the matrix to be eliminated are deleted according to the elimination positions; the remaining elements of the matrix to be eliminated are scanned to form a binary number.
Step S323: the second cell matrix is marked as the eliminated matrix, and processing continues with the next cell matrix until all cell matrices have been processed.
It can be understood that, among the cell matrices obtained in order, every position holding a 1 in the cell matrix of the n-th symbol must hold a 0 in each of the following L_p - n cell matrices. Those L_p - n cell matrices can therefore drop every position at which a 1 has already appeared.
For example, the 256*256 original matrix contains 256 distinct elements (0 to 255). The cell matrix of symbol 0 is obtained by setting a position to 1 wherever the 256*256 original matrix holds symbol 0 and to 0 elsewhere. The cell matrix of symbol 1 can then be extracted after the positions of symbol 0 have been removed.
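The elimination rule and its inverse can be sketched as follows. This is a minimal sketch under stated assumptions: each cell matrix is given as a flat list in scan order, eliminated positions accumulate across all earlier cell matrices (as the explanation of the n-th symbol suggests), and the names `eliminate` and `restore` are hypothetical.

```python
def eliminate(cells):
    """Turn cell matrices (flat 0/1 lists, table order) into binary strings."""
    alive = list(range(len(cells[0])))   # positions where no 1 has appeared yet
    numbers = []
    for cell in cells:
        bits = [cell[p] for p in alive]  # scan only the surviving positions
        numbers.append("".join(map(str, bits)))
        alive = [p for p, b in zip(alive, bits) if b == 0]
    return numbers


def restore(numbers, size):
    """Inverse of eliminate: rebuild full-size cell matrices."""
    alive = list(range(size))
    cells = []
    for number in numbers:
        cell = [0] * size                # eliminated positions are known to be 0
        for p, ch in zip(alive, number):
            cell[p] = int(ch)
        cells.append(cell)
        alive = [p for p, ch in zip(alive, number) if ch == "0"]
    return cells


# Hypothetical cell matrices of a 2*3 matrix, flattened row by row.
cells = [[1, 0, 0, 0, 1, 1],
         [0, 0, 1, 1, 0, 0],
         [0, 1, 0, 0, 0, 0]]
numbers = eliminate(cells)
print(numbers)
```

Each binary number is shorter than the one before it, because the positions already claimed by earlier symbols are never scanned again.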
Step S330: each binary number is entropy coded to form multiple compressed binary numbers.
Entropy coding can be implemented in several ways, for example Huffman coding, arithmetic coding or interval coding.
Arithmetic coding is taken as an example below:
Of the two input characters, the one with the larger probability of occurrence is the MPS (More Probable Symbol), with probability Pe; the one with the smaller probability is the LPS (Less Probable Symbol), with probability Qe, where Pe = 1 - Qe. For example, after binarization 0 may be the MPS and 1 the LPS; if the binarization is reversed, 1 is the MPS and 0 is the LPS.
During coding, two dedicated registers (C, A) are used. The value of register C is the code point (the location indicated by the pointer); it is initialized to 0 and serves as the lower limit of the interval. The value of register A is the width of the subinterval (this width is exactly the probability of the input symbol string); it is initialized to 1, so the upper limit of the interval is C + A.
As symbols of the data source are input, the contents of C and A are updated by the following coding rules:
When the low-probability symbol LPS arrives: C = C, A = A × Qe.
When the high-probability symbol MPS arrives: C = C + A × Qe, A = A × Pe = A × (1 - Qe).
Example: source symbol sequence 11011111
0 is the LPS: Qe = 1/8 = (0.001)b
1 is the MPS: Pe = 7/8 = (0.111)b
Initial state: C = 0 (start of the subinterval), A = 1 (width of the subinterval)
(1) The 1st symbol, 1, is the MPS
C = C + A × Qe = 0 + 1 × 0.001 = 0.001
A = A × Pe = 1 × 0.111 = 0.111
(2) The 2nd symbol, 1, is again the MPS
C = C + A × Qe = 0.001 + 0.111 × 0.001 = 0.001111
A = A × Pe = 0.111 × 0.111 = 0.110001
(3) The 3rd symbol, 0, is the LPS
C = C = 0.001111
A = A × Qe = 0.110001 × 0.001 = 0.000110001
And so on. Finally:
C = 0.010001111110111100000001
A = 0.000011001001000010111111
The end of the interval is then C + A = 0.010101000111111111000000, and the coding interval is [C, C + A).
The compressed binary number can be any fractional value within the final coding interval, but for the best coding efficiency the fraction chosen should have the shortest bit length. In the example above 0.0101 can be taken, so the compressed binary number is 0101.
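The register updates of the example can be replayed in Python. This is a sketch under stated assumptions: it uses exact rational arithmetic from the standard `fractions` module rather than fixed-width registers, so low-order bits of the final C and A may differ from a fixed-precision trace; the function name `encode` is hypothetical.

```python
from fractions import Fraction


def encode(bits, qe=Fraction(1, 8), lps="0"):
    """Apply the C/A update rules and return (C, A, per-step trace)."""
    c, a = Fraction(0), Fraction(1)
    trace = []
    for bit in bits:
        if bit == lps:            # LPS: C unchanged, A = A * Qe
            a = a * qe
        else:                     # MPS: C = C + A * Qe, A = A * (1 - Qe)
            c = c + a * qe
            a = a * (1 - qe)
        trace.append((c, a))
    return c, a, trace


c, a, trace = encode("11011111")
code = Fraction(0b0101, 2 ** 4)   # the chosen output 0.0101 in binary

print(trace[:3])                  # the three steps worked out in the text
print(c <= code < c + a)          # 0.0101 lies in the final interval
```

The first three trace entries are (1/8, 7/8), (15/64, 49/64) and (15/64, 49/512), which are the binary values 0.001/0.111, 0.001111/0.110001 and 0.001111/0.000110001 of the worked steps.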
Step S340: if the compressed binary number contains multiple consecutive 1s, the compressed binary number is converted according to the preset conversion rule to obtain the converted binary number.
Referring to Fig. 4, in one embodiment the preset conversion rule includes:
Step S341: the value of the first label is set to 0 and the value of the second label is set to 1.
Step S342: the first bit of the compressed binary number is taken as the current character to be encoded.
Step S343: if the current character to be encoded is 1, the coded output for that character is the value of the second label, and the values of the first label and the second label are exchanged; if the current character to be encoded is 0, the coded output for that character is the value of the first label, and the values of the first label and the second label remain unchanged.
Step S345: the next character after the current character to be encoded is taken as the current character to be encoded and is converted in the same way, until all characters of the compressed binary number have been converted, giving the converted binary number.
For example, take the compressed binary number 1011110010100001:
Let the first label be index_0 = 0 and the second label index_1 = 1, and let bit be the binary value currently being encoded.
If bit = 1: output index_1, then set index_1 = 0 and index_0 = 1.
If bit = 0: output index_0, then set index_0 = 0 and index_1 = 1.
Continuing in this way, converting the compressed binary number 1011110010100001 gives the converted binary number 1110001011110001.
Further, the converted binary number can be converted repeatedly according to the preset conversion rule, with the number of conversions recorded.
Continuing the example, the same method converts 1110001011110001 into 1001001110001001. The string 1001001110001001 no longer contains a run of four or more consecutive 0s or 1s. The number of conversions is recorded as Times = 2.
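The conversion can be sketched in Python. The sketch follows the label updates written out in the worked example, where index_0 and index_1 are set explicitly after every bit; under that reading each output bit equals the current bit XOR the previous input bit, and the inverse is a running XOR. The function names `convert` and `invert` are hypothetical.

```python
def convert(bits):
    """One pass of the label-based conversion from the worked example."""
    index_0, index_1 = "0", "1"
    out = []
    for bit in bits:
        if bit == "1":
            out.append(index_1)
            index_0, index_1 = "1", "0"   # labels after a 1
        else:
            out.append(index_0)
            index_0, index_1 = "0", "1"   # labels after a 0
    return "".join(out)


def invert(bits):
    """Inverse pass: each original bit is the output bit XOR the previous original bit."""
    prev = "0"
    out = []
    for bit in bits:
        prev = "1" if bit != prev else "0"
        out.append(prev)
    return "".join(out)


once = convert("1011110010100001")
twice = convert(once)
print(once, twice)                         # Times = 2 passes
print(invert(invert(twice)))               # back to the compressed binary number
```

Decoding applies `invert` exactly Times times, which is why the number of conversions has to be part of the coded output.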
Step S350: the converted binary number is encoded according to the preset coding rule to obtain the coded output.
Referring to Fig. 5, in one embodiment the preset coding rule includes:
Step S351: the initial coding space is expanded to obtain the expanded space; according to the static statistical model of the characters in the converted binary number, the expanded initial coding space is divided to obtain the coding space corresponding to the current character to be encoded.
Step S352: the coding space corresponding to the current character is expanded to obtain the expanded coding space; according to the statistical model of the characters, the expanded coding space is divided to obtain the coding space corresponding to the next character to be encoded.
Step S353: the next character to be encoded is taken as the current character to be encoded, until all characters of the converted binary number have been encoded, giving the coding result.
Step S354: the coding result, the length of the data to be encoded and a first statistical parameter are taken as the coded output, where the first statistical parameter is the number of 1s contained in the data to be encoded.
For example, if the converted binary number is 1010000110010101000100010, it is encoded according to the preset coding rule by the following steps:
Definitions: S is the symbol set; L_S is the number of symbols in S; the probability of occurrence of each symbol is computed according to [formula not reproduced in this text]; L is the lower limit of the current interval; H is the upper limit of the current coding interval; R is the size of the current coding interval, with R = H - L; Len is the total length of the data to be compressed. R_max, the initial coding space, is a positive integer; in arithmetic coding it is 1.
First, the parameters are initialized. The current character string contains only 0 and 1, so S ∈ {0, 1} and L_S = 2. Define R_max = 100000000000; it should be understood that R_max can be chosen relatively large. T_0 = L_S, f_k = 1 for k ∈ [0, L_S), that is f_0 = 1 and f_1 = 1; H_0 = R_0 = R_max and L_0 = 0. Set α_0 = 1.1; a static coefficient is used here, that is α_n = α_0. Len = 0 (length of the data to be encoded) and Count = 0 (the first statistical parameter, the number of 1s contained in the data to be encoded). The initial coding space is expanded to obtain the expanded space R_0 = R_max × α_0 = 110000000000. According to the static statistical model of the characters, the expanded initial coding space is divided, giving
U'_0 = [0, 54999999999], U'_1 = [55000000000, 110000000000].
Then the 1st character to be encoded, 1, is obtained. Its coding space is U'_1, which is expanded according to the formula
[formula not reproduced in this text]
giving R_1 = 30250000000 and the expanded coding space. Dividing the expanded coding space according to the statistical model of the characters gives
U'_0 = [55000000000, 85249999999], U'_1 = [85250000000, 115500000000].
The statistics are updated: Count = Count + 1, Len = Len + 1.
Next the 2nd character to be encoded, 0, is obtained. In the same way R_2 = 16637500000 and the expanded coding space are computed. Dividing the expanded coding space according to the statistical model of the characters gives U'_0 = [55000000000, 71637499999], U'_1 = [71637500000, 88275000000]. The statistics are updated: Count = Count + 0 (the character to be encoded is 0, so Count is not incremented), Len = Len + 1.
Continuing in this way, the final coding result is V' = 730429. With α_n = α_0 = 1.1, this is two digits shorter than the conventional coding result 63118085, which improves the compression ratio by 25%.
If the conversion of step S340 was applied repeatedly and the number of conversions Times was recorded, then V', Count, Len and Times together form the coded output, and coding is complete.
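The interval bookkeeping of the first two coding steps can be replayed in Python. The formula images of the example are not reproduced in this text, so the update used below is an inference from the printed numbers and is labeled as such: with f_0 = f_1 = 1 each space is split into two equal halves, the half selected by the character is kept, and that half is multiplied by α = 1.1 before the next split. The function name `interval_trace` is hypothetical, and the sketch does not derive the final value V'.

```python
from fractions import Fraction


def interval_trace(bits, r_max=10 ** 11, alpha=Fraction(11, 10)):
    """Return (low, split, high) of the expanded space before and after each bit.

    Assumed update (inferred, not the patent's stated formula):
    keep the chosen half, then expand it by alpha.
    """
    low = Fraction(0)
    total = alpha * r_max                 # expanded initial space, 110000000000
    half = total / 2
    trace = [(int(low), int(low + half), int(low + total))]
    for bit in bits:
        if bit == "1":
            low += half                   # character 1 selects the upper half
        total = alpha * half              # expand the selected half
        half = total / 2
        trace.append((int(low), int(low + half), int(low + total)))
    return trace


trace = interval_trace("10")
for low, split, high in trace:
    print(f"U'0 starts at {low}, U'1 = [{split}, {high}]")
```

Under this assumption the half-width after the first character is 30250000000 and after the second 16637500000, matching the printed R_1 and R_2, and the interval bounds match the printed U'_0 and U'_1.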
It can be understood that the coding in the embodiments of the present invention is lossless: once the coded output has been obtained, it can be decoded by the inverse computation and the original data can be restored. The decoding process is as follows:
First step: the coded output is decoded by the inverse operation of the preset coding rule to obtain the converted binary number:
First, the parameters are initialized. The current character string contains only 0 and 1, so S ∈ {0, 1} and L_S = 2. Define R_max = 100000000000; it should be understood that R_max can be chosen relatively large. T_0 = L_S, f_k = 1 for k ∈ [0, L_S), that is f_0 = 1 and f_1 = 1; H_0 = R_0 = R_max and L_0 = 0. Set α_0 = 1.1; a static coefficient is used here, that is α_n = α_0. Len = 0, Count = 0. Because α_0 = 1.1, y(n) ≈ 1, so y(n) = 1 is taken, and Ty(n) = t or [formula not reproduced in this text].
Count = 9 (the first statistical parameter, the number of 1s contained in the data to be encoded), Len = 25 (length of the data to be encoded) and the coding result V' = 730429 are obtained from the coded output.
According to the formula
[formula not reproduced in this text]
the current decoding space is obtained, t = T = 55004691494. The coding result V' is compared with t. Here V' > 550046, so 1 is output; Count = Count - 1 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1.
Then t = T = 85252570554 is obtained from 1100000000000000001111111. V' < 852525, so 0 is output; Count = Count - 0 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1.
Then t = T = 71640070554 is obtained from 1010000000000000001111111. V' > 716400, so the symbol 1 is output; Count = Count - 1 (Count is decremented only when the symbol 1 is decoded), Len = Len - 1.
Next t = T = 80789529037 is obtained from 1011000000000000000111111. V' < 807895, so the symbol 0 is output.
Continuing in this way, decoding proceeds with 1010000000000000001111111 and 1010111111100000000000000. Decoding ends when Len = 0, and the converted binary number obtained is 1010000110010101000100010.
Second step: the converted binary number is converted by the inverse operation of the preset conversion rule to obtain the compressed binary number. If the coded output contains the number of conversions Times, the inverse operation must be carried out Times times.
For example, for 1001001110001001 the first inverse conversion gives 1110001011110001; the same method is then applied to 1110001011110001 as the second inverse operation, giving the compressed binary number 1011110010100001.
Third step: entropy decoding
The interval is divided into two subintervals by Qe and Pe; the subinterval into which the code word being decoded falls is determined, and the corresponding symbol is assigned.
Let C' = (0.0101)b be the value to be decoded, with initial values A = 1 and Qe = 0.001.
When C' falls within [0, Qe × A), the decoded symbol is D = 0; then C' = C', A = Qe × A.
When C' falls within [Qe × A, A), the decoded symbol is D = 1; then C' = C' - Qe × A, A = A × (1 - Qe).
C' = 0.0101 falls within [Qe × A, A), so the decoded symbol is D = 1
C' = C' - Qe × A = 0.0101 - 0.001 = 0.0011, A = A × (1 - Qe) = 0.111
C' = 0.0011 falls within [Qe × A, A), so the decoded symbol is D = 1
C' = C' - Qe × A = 0.0011 - 0.000111 = 0.000101,
A = A × (1 - Qe) = 0.111 × 0.111 = 0.110001
C' = 0.000101 falls within [0, Qe × A), so the decoded symbol is D = 0
C' = C' = 0.000101, A = A × Qe = 0.110001 × 0.001 = 0.000110001
The MPS and LPS given above are the division under a static model. In this case the probability Qe of the LPS can be obtained directly by table lookup, and the probability of the MPS is then 1 - Qe. Encoding in binary saves the computation of radix conversion. The principle is the same.
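The decoding rule can be sketched in Python using exact rational arithmetic from the standard `fractions` module. This is a minimal sketch: the function name `decode` is hypothetical, and the number of symbols to decode (8) is passed in explicitly, which is an assumption of the example.

```python
from fractions import Fraction


def decode(code, length, qe=Fraction(1, 8), lps="0", mps="1"):
    """Decode `length` symbols from the fractional value `code`."""
    c, a = code, Fraction(1)
    out = []
    for _ in range(length):
        split = qe * a
        if c < split:             # falls within [0, Qe*A): LPS
            out.append(lps)
            a = split
        else:                     # falls within [Qe*A, A): MPS
            out.append(mps)
            c -= split
            a -= split            # same as A * (1 - Qe)
    return "".join(out)


decoded = decode(Fraction(0b0101, 2 ** 4), 8)
print(decoded)                    # 11011111, the source sequence of the coding example
```

The first three iterations reproduce the worked values: C' goes from 5/16 to 3/16 to 5/64, and the third symbol is the LPS because 5/64 is below 49/512.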
Fourth step: the cell matrices are obtained by the inverse operation of the preset elimination rule
After the binary values of the cell matrices have been obtained, the information at the eliminated character positions must be inserted back by interpolation, so that all cell matrices are obtained in turn. Taking the 256*256 original matrix as an example, the cell matrix of symbol 0 is necessarily of size 256*256 (containing only 0 and 1), so the number of 1s in it is determined. The positions at which 1 appears can therefore be inserted, filled with 0, at the corresponding positions of the cell matrix of symbol 1, which restores the cell matrix of symbol 1 to size 256*256.
Fifth step: the non-sparse matrix is restored from the sparse matrices
Step 1: Create a matrix B with the same numbers of rows and columns as the cell matrices; B is an m*n matrix. Create an empty sequence T.
Step 2: Take the cell matrix group C_SQ.
Step 3: For every cell matrix C_SQ[k], when C_SQ[k]_ij = 1, set B_ij = S_Q[k] (i ∈ [0, m-1], j ∈ [0, n-1], k ∈ [0, L_SQ - 1]); then B = C.
Step 4: Restore the data in matrix B. Following the element order of matrix B, take out the elements of B and append them in order to sequence T.
Step 5: Repeat steps 2 to 4 until all related cell matrices have been processed, and discard the data in sequence T that follows the first L_R elements of Q (these are the padded "0"s), so that T = R.
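Steps 1 to 4 of the restoration can be sketched in Python for h = 1. This is a minimal sketch: the function name `from_cell_matrices`, the 2*3 example data and the order table [0, 1, 2] are assumptions made for illustration.

```python
def from_cell_matrices(cells, table):
    """Rebuild matrix B from the cell matrix group and the order table S_Q."""
    m, n = len(cells[0]), len(cells[0][0])
    b = [[None] * n for _ in range(m)]
    for symbol, cell in zip(table, cells):
        for i in range(m):
            for j in range(n):
                if cell[i][j] == 1:      # C_SQ[k]_ij = 1  ->  B_ij = S_Q[k]
                    b[i][j] = symbol
    return b


# Hypothetical cell matrix group of a 2*3 matrix over the symbols 0, 1, 2.
S_Q = [0, 1, 2]
cells = [[[1, 0, 0], [0, 1, 1]],
         [[0, 0, 1], [1, 0, 0]],
         [[0, 1, 0], [0, 0, 0]]]

B = from_cell_matrices(cells, S_Q)
T = [value for row in B for value in row]   # step 4: append elements in order
print(B, T)
```

Removing the padding of step 5 is then a matter of truncating T to its first L_R elements, which requires L_R to be known to the decoder.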
Sixth step: iterative decoding
If multiple iterations were used during coding, decoding must likewise be carried out iteratively.
The data processing method provided by the embodiments of the present invention preprocesses the matrix formed by the characters to be encoded into multiple cell matrices, converts each cell matrix into a binary number according to the elimination rule, entropy codes each binary number to obtain a compressed binary number, converts the compressed binary number again, and finally encodes it according to the preset coding rule, so that the coded output is compressed further and the compression ratio increases. The method can be applied iteratively to obtain a better compression effect, and the original data can be restored losslessly by the inverse operation of the compression algorithm.
Refer to Fig. 6, Fig. 6 is that a kind of functional module of data processing equipment 210 that third embodiment of the invention provides is shown It is intended to, described data processing equipment 210 includes the first modular converter 211, cancellation module 212, the first coding module 213, second Modular converter 214 and the second coding module 215.
Described first modular converter 211, for a non-sparse matrix is converted to multiple cell matrixs, wherein, each Described cell matrix is corresponding with one of described non-sparse matrix symbol respectively, element 1 in each described cell matrix Position in preset order table for the position symbol corresponding with described cell matrix is corresponding, and described preset order table refers to institute State the table that in non-sparse matrix, all symbols arrange according to preset order.
Described cancellation module 212, for by each described cell matrix, eliminating after rule process according to default, shape respectively Become corresponding multiple binary number.
Wherein, described default elimination rule includes for first described cell matrix scanning formation binary system in order Number, and first described cell matrix is designated as eliminating matrix;Using second described cell matrix as matrix to be canceled;Obtain Take and eliminate the position that in matrix, element 1 occurs, be designated as eliminating position;According to described elimination position, by described matrix to be canceled The element of middle corresponding position is deleted;Scan surplus element in described matrix to be canceled, form binary number;By described in second Cell matrix is designated as eliminating matrix, and continues with next cell matrix, until all cell matrixs are disposed.
The first coding module 213 is configured to entropy code each of the binary numbers to form multiple compressed binary numbers.
The second conversion module 214 is configured to, if a compressed binary number contains multiple consecutive 1s, convert the compressed binary number according to a preset conversion rule to obtain a converted binary number.
The preset conversion rule includes: setting the value of a first label to 0 and the value of a second label to 1; taking the first bit of the compressed binary number as the current character to be encoded; if the current character to be encoded is 1, outputting the value of the second label as the code for that character and exchanging the values of the first label and the second label; if the current character to be encoded is 0, outputting the value of the first label as the code for that character and leaving the values of the first label and the second label unchanged; and taking the next character after the current character to be encoded as the new current character to be encoded and converting it in the same way, until all characters of the compressed binary number have been converted, thereby obtaining the converted binary number.
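The label-swapping conversion rule can be written directly from the steps above. The sketch below is illustrative only; the names `convert` and `invert` are hypothetical. The inverse is not spelled out in the text: it follows from observing that the rule makes each output bit equal to the parity of the 1s read so far, so each input bit can be recovered by comparing neighbouring output bits.

```python
def convert(bits):
    """Convert a string of '0'/'1' characters with the two-label rule."""
    first, second = "0", "1"  # first label = 0, second label = 1
    out = []
    for ch in bits:
        if ch == "1":
            out.append(second)            # a 1 emits the second label ...
            first, second = second, first  # ... and then swaps the labels
        else:
            out.append(first)             # a 0 emits the first label, no swap
    return "".join(out)


def invert(bits):
    """Recover the input of convert(): a bit was 1 where the output changed."""
    prev = "0"  # the first label's initial value
    out = []
    for ch in bits:
        out.append("1" if ch != prev else "0")
        prev = ch
    return "".join(out)


print(convert("111"))   # 101
print(convert("0110"))  # 0100
print(invert(convert("0110")))  # 0110
```

Because `invert` undoes `convert` exactly, applying the conversion several times and recording the number of passes remains lossless.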
The second coding module 215 is configured to encode the converted binary number according to a preset coding rule to obtain the coded output.
Each of the above modules may be implemented in software code, in which case the modules may be stored in the memory of the server 200. Each of the above modules may equally be implemented in hardware such as an integrated circuit chip.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for parts that are identical or similar the embodiments may be referred to one another.
The data processing device provided by this embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiment. For brevity, where the device embodiment does not mention something, refer to the corresponding content of the foregoing method embodiment.
It should be understood that, in the several embodiments provided in this application, the disclosed device and method may also be implemented in other ways. The device embodiment described above is merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes that element.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A data processing method, characterized in that the method comprises:
converting a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the positions of the element 1 in each cell matrix correspond to the positions, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is the table in which all symbols of the non-sparse matrix are arranged in a preset order;
processing each cell matrix according to a preset elimination rule to form multiple corresponding binary numbers;
entropy coding each of the binary numbers to form multiple compressed binary numbers;
if a compressed binary number contains multiple consecutive 1s, converting the compressed binary number according to a preset conversion rule to obtain a converted binary number; and
encoding the converted binary number according to a preset coding rule to obtain a coded output.
2. The method according to claim 1, characterized in that the preset elimination rule comprises:
scanning the first cell matrix in order to form a binary number, and marking the first cell matrix as the eliminated matrix;
taking the second cell matrix as the matrix to be eliminated; obtaining the positions at which the element 1 occurs in the eliminated matrix, recorded as the elimination positions; deleting, according to the elimination positions, the elements at the corresponding positions in the matrix to be eliminated; and scanning the remaining elements of the matrix to be eliminated to form a binary number; and
marking the second cell matrix as the eliminated matrix and continuing with the next cell matrix until all cell matrices have been processed.
3. The method according to claim 1, characterized in that the entropy coding is arithmetic coding or range coding.
4. The method according to claim 1, characterized in that the preset conversion rule comprises:
setting the value of a first label to 0 and the value of a second label to 1;
taking the first bit of the compressed binary number as the current character to be encoded;
if the current character to be encoded is 1, outputting the value of the second label as the code for that character, and exchanging the values of the first label and the second label;
if the current character to be encoded is 0, outputting the value of the first label as the code for that character, the values of the first label and the second label remaining unchanged; and
taking the next character after the current character to be encoded as the new current character to be encoded and converting it, until all characters of the compressed binary number have been converted, to obtain the converted binary number.
5. The method according to claim 4, characterized in that, after the converted binary number is obtained, the method further comprises:
converting the converted binary number repeatedly according to the preset conversion rule, and recording the number of conversions.
6. The method according to claim 1, characterized in that the preset coding rule comprises:
expanding an initial coding space to obtain an expanded space, and dividing the expanded initial coding space according to a static statistical model of the characters in the converted binary number, to obtain the coding space corresponding to the current character to be encoded;
expanding the coding space corresponding to the current character to be encoded to obtain an expanded coding space, and dividing the expanded coding space according to the statistical model of the characters, to obtain the coding space corresponding to the next character to be encoded;
taking the next character to be encoded as the current character to be encoded, until all characters in the converted binary number have been encoded, to obtain a coding result; and
taking the coding result, the length of the data to be encoded and a first statistical parameter as the coded output, the first statistical parameter being the number of 1s contained in the data to be encoded.
7. The method according to claim 5, characterized in that the coded output further includes the number of conversions.
8. A data processing device, characterized in that the device comprises:
a first conversion module, configured to convert a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the positions of the element 1 in each cell matrix correspond to the positions, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is the table in which all symbols of the non-sparse matrix are arranged in a preset order;
an elimination module, configured to process each cell matrix according to a preset elimination rule to form multiple corresponding binary numbers;
a first coding module, configured to entropy code each of the binary numbers to form multiple compressed binary numbers;
a second conversion module, configured to, if a compressed binary number contains multiple consecutive 1s, convert the compressed binary number according to a preset conversion rule to obtain a converted binary number; and
a second coding module, configured to encode the converted binary number according to a preset coding rule to obtain a coded output.
9. The device according to claim 8, characterized in that the preset elimination rule comprises:
scanning the first cell matrix in order to form a binary number, and marking the first cell matrix as the eliminated matrix;
taking the second cell matrix as the matrix to be eliminated; obtaining the positions at which the element 1 occurs in the eliminated matrix, recorded as the elimination positions; deleting, according to the elimination positions, the elements at the corresponding positions in the matrix to be eliminated; and scanning the remaining elements of the matrix to be eliminated to form a binary number; and
marking the second cell matrix as the eliminated matrix and continuing with the next cell matrix until all cell matrices have been processed.
10. The device according to claim 8, characterized in that the preset conversion rule comprises:
setting the value of a first label to 0 and the value of a second label to 1;
taking the first bit of the compressed binary number as the current character to be encoded;
if the current character to be encoded is 1, outputting the value of the second label as the code for that character, and exchanging the values of the first label and the second label;
if the current character to be encoded is 0, outputting the value of the first label as the code for that character, the values of the first label and the second label remaining unchanged; and
taking the next character after the current character to be encoded as the new current character to be encoded and converting it, until all characters of the compressed binary number have been converted, to obtain the converted binary number.
CN201610701324.7A 2016-08-22 2016-08-22 Data processing method and device Expired - Fee Related CN106452451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610701324.7A CN106452451B (en) 2016-08-22 2016-08-22 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610701324.7A CN106452451B (en) 2016-08-22 2016-08-22 Data processing method and device

Publications (2)

Publication Number Publication Date
CN106452451A true CN106452451A (en) 2017-02-22
CN106452451B CN106452451B (en) 2019-09-13

Family

ID=58181396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610701324.7A Expired - Fee Related CN106452451B (en) 2016-08-22 2016-08-22 Data processing method and device

Country Status (1)

Country Link
CN (1) CN106452451B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109474281A (en) * 2018-09-30 2019-03-15 湖南瑞利德信息科技有限公司 Data encoding, coding/decoding method and device
CN109525249A (en) * 2018-09-30 2019-03-26 湖南瑞利德信息科技有限公司 Coding-decoding method, system, readable storage medium storing program for executing and computer equipment
WO2023061177A1 (en) * 2021-10-12 2023-04-20 深圳智慧林网络科技有限公司 Multi-data sending method, apparatus and device based on columnar data scanning, and multi-data receiving method, apparatus and device based on columnar data scanning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699646A (en) * 2013-12-24 2014-04-02 吕志强 Tagging reversible compression method for binary data
CN104753626A (en) * 2013-12-25 2015-07-01 华为技术有限公司 Data compression method, equipment and system
US20160004715A1 (en) * 2014-07-02 2016-01-07 International Business Machines Corporation Minimizing Metadata Representation In A Compressed Storage System
CN105791832A (en) * 2016-03-08 2016-07-20 湖南千年华光软件开发有限公司 Data coding method, data decoding method, data coding system and data decoding system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王防修 等: "通过哈弗曼编码实现文件的压缩与解压", 《武汉工业学院学报》 *
陈鹏 等: "基于约束等距的块稀疏压缩采样匹配追踪算法", 《系统工程与电子技术》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109474281A (en) * 2018-09-30 2019-03-15 湖南瑞利德信息科技有限公司 Data encoding, coding/decoding method and device
CN109525249A (en) * 2018-09-30 2019-03-26 湖南瑞利德信息科技有限公司 Coding-decoding method, system, readable storage medium storing program for executing and computer equipment
CN109525249B (en) * 2018-09-30 2023-10-27 湖南瑞利德信息科技有限公司 Encoding and decoding method, system, readable storage medium and computer device
WO2023061177A1 (en) * 2021-10-12 2023-04-20 深圳智慧林网络科技有限公司 Multi-data sending method, apparatus and device based on columnar data scanning, and multi-data receiving method, apparatus and device based on columnar data scanning

Also Published As

Publication number Publication date
CN106452451B (en) 2019-09-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190913

Termination date: 20200822