CN106452451A - Data processing method and device - Google Patents
Publication number: CN106452451A
Application number: CN201610701324.7A
Authority: CN (China)
Keywords: matrix, binary number, encoded, character, coding
Legal status: Granted
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
Abstract
An embodiment of the invention provides a data processing method and device. The method comprises: converting a non-sparse matrix into multiple unit (cell) matrices; processing the unit matrices according to a preset elimination rule to form multiple corresponding binary numbers; entropy coding the binary numbers to form multiple compressed binary numbers; if a compressed binary number contains multiple consecutive 1s, converting it according to a preset conversion rule to obtain a converted binary number; and coding the converted binary number according to a preset coding rule to obtain the coding output. The method further improves the data compression rate, achieves a better compression effect, and allows the original data to be restored without loss.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a data processing method and device.
Background
Although existing interval (range) coding and arithmetic coding can compress data to a certain degree, their compression ratio is not high.
Summary of the invention
In view of this, embodiments of the present invention provide a data processing method and device to solve the above problem.
In a first aspect, an embodiment of the present invention provides a data processing method, the method comprising: converting a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position of that cell matrix's symbol in a preset order table, and the preset order table is a table in which all symbols of the non-sparse matrix are arranged in a preset order; processing each cell matrix according to a preset elimination rule to form a corresponding binary number for each; entropy coding each binary number to form multiple compressed binary numbers; if a compressed binary number contains multiple consecutive 1s, converting the compressed binary number according to a preset conversion rule to obtain a converted binary number; and encoding the converted binary number according to a preset coding rule to obtain the coding output.
In a second aspect, an embodiment of the present invention provides a data processing device, the device comprising: a first conversion module, configured to convert a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position of that cell matrix's symbol in a preset order table, and the preset order table is a table in which all symbols of the non-sparse matrix are arranged in a preset order; an elimination module, configured to process each cell matrix according to a preset elimination rule to form a corresponding binary number for each; a first coding module, configured to entropy code each binary number to form multiple compressed binary numbers; a second conversion module, configured to, if a compressed binary number contains multiple consecutive 1s, convert the compressed binary number according to a preset conversion rule to obtain a converted binary number; and a second coding module, configured to encode the converted binary number according to a preset coding rule to obtain the coding output.
Compared with the prior art, the data processing method and device provided by the embodiments of the present invention preprocess the matrix composed of the characters to be encoded into multiple cell matrices, convert each cell matrix into a binary number according to the elimination rule, entropy code each binary number to obtain a compressed binary number, convert the compressed binary number again, and finally encode it according to the preset coding rule, so that the coding output is compressed further and the compression ratio increases. The method can also be applied iteratively to obtain a better compression effect.
To make the above objects, features, and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can derive other related drawings from them without creative effort.
Fig. 1 is a block diagram of a server provided by an embodiment of the present invention.
Fig. 2 is a flowchart of a data processing method provided by the first embodiment of the present invention.
Fig. 3 is a partial flowchart of step S320 in the data processing method provided by the first embodiment of the present invention.
Fig. 4 is a partial flowchart of step S340 in the data processing method provided by the first embodiment of the present invention.
Fig. 5 is a partial flowchart of step S350 in the data processing method provided by the first embodiment of the present invention.
Fig. 6 is a functional block diagram of a data processing device provided by the second embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be defined or explained again in subsequent drawings. In the description of the present invention, the terms "first", "second", and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Fig. 1 is a block diagram of a server 200. The server 200 includes a data processing device 210, a memory 220, a memory controller 230, and a processor 240.
The memory 220, the memory controller 230, and the processor 240 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected through one or more communication buses or signal lines. The data processing device 210 includes at least one software functional module that can be stored in the memory 220 in the form of software or firmware, or built into the operating system (OS) of the server 200. The processor 240 executes the executable modules stored in the memory 220, for example the software functional modules or computer programs included in the data processing device 210.
The memory 220 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory 220 stores programs, and the processor 240 executes a program after receiving an execution instruction. The method performed by the server, as defined by the flow disclosed in any of the foregoing embodiments of the present invention, may be applied in the processor 240 or implemented by the processor 240.
The processor 240 may be an integrated circuit chip with signal processing capability. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Fig. 2 shows a flowchart of a data processing method provided by the first embodiment of the present invention. The method includes:
Step S310: convert a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position of that cell matrix's symbol in a preset order table, and the preset order table is a table in which all symbols of the non-sparse matrix are arranged in a preset order.
Step S310 can be implemented in several ways. One implementation is described below; it should be understood that the step is not limited to it. The implementation steps are as follows:
Step 1: Let the non-sparse matrix P be a set of L_p symbols, and let the preset order table S_p be the table in which all symbols of P are arranged in the preset order. P_i ∈ P (i is a natural number, including 0, not greater than L_p - 1), where i is the index of symbol P_i.
Step 2: Let Q be the set of all sequences of length h (h ≥ 1) formed from the symbols of P, and let S_Q be the table in which the elements of Q are arranged in the lexicographic order given by S_p; its length is L_SQ. When h = 1, the data is processed one element of the data-generating set at a time; when h > 1, the data is processed in units of length-h sequences of such elements.
Step 3: Let R be a data sequence generated from the elements of Q, with length L_R.
Step 4: Let C be an m×n matrix (m and n are positive integers). Each element C_ij (i ∈ [0, m-1], j ∈ [0, n-1]) is a subsequence of length h cut from R in order and placed into C in matrix-element order; hence C_ij ∈ Q.
Step 5: Following the order of table S_Q, extract from matrix C the cell matrix C_SQ[k] of each element S_Q[k] (k ∈ [0, L_SQ - 1]). C_SQ[k] is an m×n matrix whose elements C_SQ[k]_ij (i ∈ [0, m-1], j ∈ [0, n-1]) take the value 1 if C_ij = S_Q[k] and 0 otherwise. In this way matrix C is converted into a group of L_SQ cell matrices.
Step 6: Repeat steps 1 to 5. If the remaining length at the end of R is insufficient, pad it with "0".
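The steps above can be sketched in Python for the case h = 1. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `pack_matrix` and `to_cell_matrices` are hypothetical, the matrix is filled row by row, and the `order` list stands in for the preset order table S_Q.

```python
def pack_matrix(seq, m, n, pad=0):
    """Arrange sequence R row by row into an m x n matrix C, padding the tail with `pad`."""
    if len(seq) > m * n:
        raise ValueError("sequence does not fit into an m x n matrix")
    flat = list(seq) + [pad] * (m * n - len(seq))
    return [flat[r * n:(r + 1) * n] for r in range(m)]


def to_cell_matrices(matrix, order=None):
    """Return one 0/1 indicator matrix per symbol, in the order of the symbol table."""
    if order is None:
        # Assumption: sorted symbols play the role of the preset order table.
        order = sorted({x for row in matrix for x in row})
    return [[[1 if x == s else 0 for x in row] for row in matrix] for s in order]


C = pack_matrix([0, 1, 2, 1, 0], m=2, n=3)   # the last cell is padded with 0
cells = to_cell_matrices(C, order=[0, 1, 2])
for symbol, cell in zip([0, 1, 2], cells):
    print(symbol, cell)
```

Each position of C holds exactly one symbol, so across all cell matrices every position carries a 1 exactly once. The elimination rule of step S320 relies on this property.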
Step S320: process each cell matrix according to the preset elimination rule to form a corresponding binary number for each.
Referring to Fig. 3, in one implementation the preset elimination rule includes:
Step S321: scan the first cell matrix in order to form a binary number, and mark the first cell matrix as the eliminated matrix.
Step S322: take the second cell matrix as the matrix to be eliminated; obtain the positions at which element 1 occurs in the eliminated matrix and record them as elimination positions; delete the elements at the elimination positions from the matrix to be eliminated; scan the remaining elements of the matrix to be eliminated to form a binary number.
Step S323: mark the second cell matrix as the eliminated matrix and continue with the next cell matrix until all cell matrices have been processed.
It can be understood that, with the cell matrices obtained in order, every position holding a 1 in the cell matrix of the n-th symbol must hold a 0 in the subsequent L_p - n cell matrices; those L_p - n cell matrices can therefore drop every position at which a 1 has already appeared.
For example, a 256×256 original matrix has 256 element values (0 to 255). The cell matrix of symbol 0 holds a 1 at each position where the original matrix holds symbol 0, and a 0 elsewhere. The cell matrix of symbol 1 can then be extracted after the positions of symbol 0 have been removed.
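A minimal sketch of the elimination rule follows. It assumes the cumulative reading given in the paragraph above (positions claimed by any earlier cell matrix are dropped) and a row-by-row scan; the function name `eliminate` is hypothetical.

```python
def eliminate(cell_matrices):
    """cell_matrices: list of m x n 0/1 matrices in symbol-table order.

    Returns one bit string per matrix, with positions already claimed
    by an earlier matrix removed from the scan.
    """
    claimed = set()      # flat positions where an earlier matrix holds a 1
    outputs = []
    for cell in cell_matrices:
        flat = [b for row in cell for b in row]          # row-by-row scan
        outputs.append("".join(str(b) for i, b in enumerate(flat) if i not in claimed))
        claimed.update(i for i, b in enumerate(flat) if b == 1)
    return outputs


cells = [
    [[1, 0, 0], [0, 1, 0]],   # symbol 0
    [[0, 1, 0], [1, 0, 0]],   # symbol 1
    [[0, 0, 1], [0, 0, 1]],   # symbol 2
]
print(eliminate(cells))       # ['100010', '1010', '11']
```

The strings shrink from 6 to 4 to 2 bits, and the last one consists only of 1s, because every position that remains belongs to the last symbol.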
Step S330: entropy code each binary number to form multiple compressed binary numbers.
Entropy coding can be implemented in several ways, for example Huffman coding, arithmetic coding, or interval (range) coding. Arithmetic coding is used as the example below.
Of the two input characters, the one with the larger probability of occurrence is the MPS (More Probable Symbol), with probability Pe; the one with the smaller probability is the LPS (Less Probable Symbol), with probability Qe, where Pe = 1 - Qe. If, after the binarization described above, 0 is the MPS, then 1 is the LPS; if the binarization is reversed, 1 is the MPS and 0 is the LPS.
During encoding, two dedicated registers (C, A) are used. The value of register C is the coding point (the position the pointer indicates); it is initialized to 0 and serves as the lower bound of the interval. The value of register A is the width of the subinterval (this width is exactly the probability of the input symbol string); it is initialized to 1, so the upper bound of the interval is C + A.
As the source data to be encoded is input, the contents of C and A are updated by the following coding rules:
When the low-probability symbol LPS arrives: C = C, A = A × Qe.
When the high-probability symbol MPS arrives: C = C + A × Qe, A = A × Pe = A × (1 - Qe).
Example: source symbol sequence 11011111
0 is the LPS: Qe = 1/8 = (0.001)b
1 is the MPS: Pe = 7/8 = (0.111)b
Initial state: C = 0 (start of the subinterval), A = 1 (width of the subinterval)
(1) The 1st symbol, 1, is the MPS:
C = C + A × Qe = 0 + 1 × 0.001 = 0.001
A = A × Pe = 1 × 0.111 = 0.111
(2) The 2nd symbol, 1, is again the MPS:
C = C + A × Qe = 0.001 + 0.111 × 0.001 = 0.001111
A = A × Pe = 0.111 × 0.111 = 0.110001
(3) The 3rd symbol, 0, is the LPS:
C = C = 0.001111
A = A × Qe = 0.110001 × 0.001 = 0.000110001
And so on, finally giving:
C = 0.010001111110111100000001
A = 0.000011001001000010111111
The end of the interval is then C + A = 0.010101000111111111000000, and the coding interval is [C, C + A).
The compressed binary number can be any fractional value within the final coding interval, but for the best coding efficiency the chosen fraction should have the shortest bit length. In the example above 0.0101 can be chosen, so the compressed binary number is 0101.
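The register updates can be replayed with exact rational arithmetic. The sketch below is an illustration under stated assumptions: `arithmetic_encode` and `shortest_code` are hypothetical names, and the search for the shortest fraction in [C, C + A) is one simple way to make the choice described above. Because it uses exact fractions, the low-order bits of the final C and A need not coincide with the 24-bit values printed in the example.

```python
from fractions import Fraction
import math


def arithmetic_encode(bits, qe=Fraction(1, 8), lps="0"):
    """Apply the (C, A) update rules; returns the final lower bound C and width A."""
    c, a = Fraction(0), Fraction(1)
    for bit in bits:
        if bit == lps:
            a = a * qe                 # LPS: C unchanged, A = A * Qe
        else:
            c = c + a * qe             # MPS: C = C + A * Qe
            a = a * (1 - qe)           #      A = A * (1 - Qe)
    return c, a


def shortest_code(c, a):
    """Shortest bit string d such that the binary fraction 0.d lies in [C, C + A)."""
    n = 1
    while True:
        k = math.ceil(c * 2 ** n)      # smallest n-bit fraction that is >= C
        if Fraction(k, 2 ** n) < c + a:
            return format(k, "0{}b".format(n))
        n += 1


c, a = arithmetic_encode("11011111")
print(shortest_code(c, a))             # 0101
```

The decoder needs to know Qe and how many symbols to produce; how that side information is carried is outside this sketch.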
Step S340: if a compressed binary number contains multiple consecutive 1s, convert the compressed binary number according to the preset conversion rule to obtain the converted binary number.
Referring to Fig. 4, in one implementation the preset conversion rule includes:
Step S341: set the value of the first label to 0 and the value of the second label to 1.
Step S342: take the first bit of the compressed binary number as the current character to be encoded.
Step S343: if the current character to be encoded is 1, the coding output for that character is the value of the second label, and the values of the first label and the second label are exchanged; if the current character to be encoded is 0, the coding output for that character is the value of the first label, and the values of the first label and the second label remain unchanged.
Step S345: take the next character after the current character to be encoded as the current character to be encoded and convert it, until all characters of the compressed binary number have been converted, obtaining the converted binary number.
For example, take the compressed binary number 1011110010100001:
Assume the first label index_0 = 0 and the second label index_1 = 1, and let bit be the binary value currently being encoded.
If bit = 1: output index_1, then set index_1 = 0 and index_0 = 1.
If bit = 0: output index_0, then set index_0 = 0 and index_1 = 1.
Proceeding in this way, the compressed binary number 1011110010100001 is converted into the binary number 1110001011110001.
Further, the converted binary number may be converted repeatedly according to the preset conversion rule, with the number of conversions recorded.
Continuing the example, 1110001011110001 is converted into 1001001110001001 by the same method. The string 1001001110001001 no longer contains a run of four or more consecutive 0s or 1s. The number of conversions is then recorded as Times = 2.
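The conversion can be sketched as follows. The sketch follows the label updates of the worked example (after a 1 the labels become index_0 = 1, index_1 = 0; after a 0 they return to index_0 = 0, index_1 = 1), which reproduces the strings printed above. The names `convert` and `convert_repeatedly`, the stopping test on runs of four equal bits, and the `max_times` cap are assumptions made for illustration.

```python
def convert(bits):
    """One pass of the conversion rule as used in the worked example."""
    index_0, index_1 = "0", "1"
    out = []
    for bit in bits:
        if bit == "1":
            out.append(index_1)
            index_0, index_1 = "1", "0"    # labels after a 1
        else:
            out.append(index_0)
            index_0, index_1 = "0", "1"    # labels after a 0
    return "".join(out)


def convert_repeatedly(bits, run=4, max_times=16):
    """Convert until no run of `run` equal bits remains; returns (bits, Times)."""
    times = 0
    while ("0" * run in bits or "1" * run in bits) and times < max_times:
        bits = convert(bits)
        times += 1
    return bits, times


print(convert("1011110010100001"))               # 1110001011110001
print(convert_repeatedly("1011110010100001"))    # ('1001001110001001', 2)
```

With these label updates, each output bit is 1 exactly when the input bit differs from the previous input bit (taking the bit before the first one as 0), so a long run of equal bits turns into a single 1 followed by 0s.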
Step S350: encode the converted binary number according to the preset coding rule to obtain the coding output.
Referring to Fig. 5, in one implementation the preset coding rule includes:
Step S351: expand the initial coding space to obtain an expanded space, and divide the expanded initial coding space according to the static statistical model of the characters in the converted binary number, to obtain the coding space corresponding to the current character to be encoded.
Step S352: expand the coding space corresponding to the current character to be encoded to obtain an expanded coding space; divide the expanded coding space according to the statistical model of the characters, to obtain the coding space corresponding to the next character to be encoded.
Step S353: take the next character to be encoded as the current character to be encoded, and repeat until all characters of the converted binary number have been encoded, obtaining the coding result.
Step S354: take the coding result, the length of the data to be encoded, and a first statistical parameter as the coding output, where the first statistical parameter is the number of 1s contained in the data to be encoded.
For example, suppose the converted binary number is 1010000110010101000100010. It is encoded according to the preset coding rule as follows.
Definitions: S is the symbol set; L_S is the number of symbols in S; the probability of each symbol is calculated according to a formula that is not reproduced in this text; L is the lower bound of the current coding interval; H is the upper bound of the current coding interval; R is the size of the current coding interval, where R = H - L; Len is the total length of the data to be compressed. R_max is the initial coding space, a positive integer; in arithmetic coding it is 1.
First, initialize the parameters. The current string contains only 0 and 1, so S = {0, 1} and L_S = 2. Define R_max = 100000000000 (the value of R_max can be relatively large), T_0 = L_S, and f_k = 1 for k ∈ [0, L_S), that is, f_0 = 1 and f_1 = 1; H_0 = R_0 = R_max and L_0 = 0. Set α_0 = 1.1; a static coefficient is used here, that is, α_n = α_0. Set Len = 0 (the length of the data to be encoded) and Count = 0 (the first statistical parameter, the number of 1s in the data to be encoded). Expanding the initial coding space gives the expanded space R_0 = R_max × α_0 = 110000000000. Dividing the expanded initial coding space according to the static statistical model of the characters gives
U'_0 = [0, 54999999999], U'_1 = [55000000000, 110000000000].
Next, take the 1st character to be encoded, 1. Its coding space is U'_1, which is expanded; according to the formula (not reproduced in this text), R_1 = 30250000000, giving the expanded coding space. Dividing the expanded coding space according to the statistical model of the characters gives
U'_0 = [55000000000, 85249999999], U'_1 = [85250000000, 115500000000].
The statistics are updated: Count = Count + 1, Len = Len + 1.
Then take the 2nd character to be encoded, 0. In the same way, R_2 = 16637500000, giving the expanded coding space; dividing it according to the statistical model of the characters gives U'_0 = [55000000000, 71637499999], U'_1 = [71637500000, 88275000000]. The statistics are updated: Count = Count + 0 (the character to be encoded is 0, so Count is not incremented), Len = Len + 1.
And so on; the final coding result is V' = 730429. With α_n = α_0 = 1.1, this is 2 digits shorter than the conventional coding result 63118085, a 25% improvement in compression ratio.
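The interval values in this example can be reproduced with a short sketch. The update formula itself is not reproduced in this text, so the rule used below (take the sub-space of the coded character, multiply its width by α, split it equally because f_0 = f_1 = 1) is an assumption inferred from the printed numbers, and `split` and `trace_spaces` are hypothetical names. The sketch does not derive the final value V'.

```python
from fractions import Fraction


def split(low, width, alpha):
    """Expand a space of the given width by alpha and split it into two equal sub-spaces."""
    expanded = width * alpha
    half = expanded / 2
    u0 = (int(low), int(low + half) - 1)
    u1 = (int(low + half), int(low + expanded))
    return half, (u0, u1)


def trace_spaces(bits, r_max=100_000_000_000, alpha=Fraction(11, 10)):
    """Return the (U'_0, U'_1) pair before the first character and after each character."""
    low, width = Fraction(0), Fraction(r_max)
    half, spaces = split(low, width, alpha)
    history = [spaces]
    for bit in bits:
        if bit == "1":
            low += half            # character 1 selects the upper sub-space
        width = half               # the selected sub-space is expanded next
        half, spaces = split(low, width, alpha)
        history.append(spaces)
    return history


for u0, u1 in trace_spaces("10"):
    print(u0, u1)
```

For the first two characters, 1 and 0, this prints the three pairs of spaces listed above; the width of each sub-space after a character is the R_1 or R_2 value quoted in the text.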
If the conversion in step S340 was applied repeatedly according to the preset conversion rule and the number of conversions Times was recorded, then V', Count, Len, and Times are taken together as the coding output, and encoding is complete.
It is understood that the embodiment of the present invention is lossless coding, after getting described coding output, can carry out inverse
It is decoded to calculating, initial data can be restored.Decoding process is as follows:
The first step:According to described coding output, the inverse operation according to pre-arranged code rule is decoded, after being changed
Binary number:
First:Initialization relevant parameter, due to only having 0 and 1 in current character string, so S ∈ { 0,1 }, then Ls=
2.Define Rmax=100000000000 it is to be appreciated that RmaxValue can be relatively large, T0=Ls, fk=1, k ∈ [0, Ls)
I.e. f0=1, f1=1, H0=R0=Rmax、L0=0.Set α0=1.1 adopt static coefficient, i.e. α heren=α0.Len=0,
Count=0;Due to being α0=1.1, y (n) ≈ 1.So taking y (n)=1, Ty (n)=t or
Obtain Count=9 (the first statistical parameter comprises 1 number in described data to be encoded), Len=25 (waits to compile
Code data length) and coding result V '=730429.
According to formula:
Obtain current solution code space, t=T=55004691494.Coding result V ' and t is compared.Now V ' >
550046, so output 1;Count=Count-1 (just subtracts 1 when only decoding symbol 1), Len=Len-1.
Then, draw t=T=85252570554 according to 1100000000000000001111111.V ' < 852525, defeated
Go out 0;;Count=Count-0 (just subtracts 1 when only decoding symbol 1), Len=Len-1.
Then, draw t=T=71640070554 according to 1010000000000000001111111.V ' > 716400, defeated
Go out symbol 1;Count=Count-1 (just subtracts 1 when only decoding symbol 1), Len=Len-1.
Now, draw t=T=80789529037 according to 1011000000000000000111111.V ' < 807895, defeated
Go out symbol 0;
By that analogy, using 1010000000000000001111111 and 1010111111100000000000000 continuation
Decoding.Decode as Len=0 and terminate, the binary number obtaining after changing is 1010000110010101000100010.
Second step: convert the converted binary number by the inverse operation of the preset conversion rule to obtain the compressed binary number. If the coding output contains the number of conversions Times, the inverse operation must be performed Times times.
For example, the first inverse conversion of 1001001110001001 gives the binary number 1110001011110001; applying the same method to 1110001011110001 as the second inverse operation gives the compressed binary number 1011110010100001.
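A minimal sketch of the inverse conversion follows. It assumes the label updates of the worked example in step S340, under which each converted bit is 1 exactly when the original bit differs from the one before it; the inverse is then a running XOR over the converted bits. The names `invert` and `invert_repeatedly` are hypothetical.

```python
def invert(bits):
    """Undo one pass of the conversion: each original bit is the XOR of all converted bits so far."""
    prev = 0
    out = []
    for ch in bits:
        prev ^= int(ch)        # flip the running value whenever a 1 is read
        out.append(str(prev))
    return "".join(out)


def invert_repeatedly(bits, times):
    """Apply the inverse conversion Times times, as recorded in the coding output."""
    for _ in range(times):
        bits = invert(bits)
    return bits


print(invert("1001001110001001"))                 # 1110001011110001
print(invert_repeatedly("1001001110001001", 2))   # 1011110010100001
```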
Third step: entropy decoding
The interval is divided into two subintervals according to Qe and Pe; the decoder determines which subinterval the code word being decoded falls in and outputs the corresponding symbol.
Let C' = (0.0101)b be the value being decoded, with initial values A = 1 and Qe = 0.001.
When C' falls within [0, Qe × A), the decoded symbol is D = 0; then C' = C', A = Qe × A.
When C' falls within [Qe × A, A), the decoded symbol is D = 1; then C' = C' - Qe × A, A = A × (1 - Qe).
C' = 0.0101 falls within [Qe × A, A), so the decoded symbol is D = 1:
C' = C' - Qe × A = 0.0101 - 0.001 = 0.0011, A = A × (1 - Qe) = 0.111
C' = 0.0011 falls within [Qe × A, A), so the decoded symbol is D = 1:
C' = C' - Qe × A = 0.0011 - 0.000111 = 0.000101,
A = A × (1 - Qe) = 0.111 × 0.111 = 0.110001
C' = 0.000101 falls within [0, Qe × A), so the decoded symbol is D = 0:
C' = C' = 0.000101, A = A × Qe = 0.110001 × 0.001 = 0.000110001
The MPS and LPS given above are the division under a static model. In this case the probability Qe of the LPS can be obtained directly by table lookup, and the probability of the MPS is then 1 - Qe. Working in binary saves number-system conversion operations. The principle is the same.
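The decoding rule can be sketched with exact fractions as follows. The function name `arithmetic_decode` is hypothetical, and the number of symbols to decode is passed in explicitly, which is an assumption about how the decoder knows when to stop.

```python
from fractions import Fraction


def arithmetic_decode(code_bits, n_symbols, qe=Fraction(1, 8), lps="0", mps="1"):
    """Decode n_symbols from the binary fraction 0.code_bits using the C'/A rules."""
    value = Fraction(int(code_bits, 2), 2 ** len(code_bits))   # C'
    a = Fraction(1)                                            # A
    out = []
    for _ in range(n_symbols):
        split = a * qe
        if value < split:          # C' in [0, Qe * A): LPS
            out.append(lps)
            a = split
        else:                      # C' in [Qe * A, A): MPS
            out.append(mps)
            value -= split
            a = a * (1 - qe)
    return "".join(out)


print(arithmetic_decode("0101", 8))    # 11011111
```

Decoding the code word 0101 for 8 symbols returns the source sequence 11011111 used in the encoding example.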
Fourth step: obtain the cell matrices by the inverse operation of the preset elimination rule
After the binary number of a cell matrix has been obtained, the information at the eliminated character positions must be re-inserted by interpolation, so that all cell matrices are obtained in turn. Taking a 256×256 original matrix as an example, the cell matrix of symbol 0 must have size 256×256 (containing only 0s and 1s), so the positions of its 1s are determined. The positions at which 1 appears can therefore be inserted into the corresponding positions of the cell matrix of symbol 1, restoring the cell matrix of symbol 1 to size 256×256.
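The re-insertion can be sketched as the mirror image of the elimination rule. The sketch assumes the binary numbers are given in symbol-table order as bit strings, that the first one covers all positions, and that cell matrices are handled as flat row-by-row lists; `restore_cells` is a hypothetical name.

```python
def restore_cells(bit_strings, size):
    """Rebuild full-length 0/1 cell matrices (flat lists) from eliminated bit strings."""
    claimed = set()      # positions already known to hold a 1 in an earlier matrix
    cells = []
    for bits in bit_strings:
        free = [i for i in range(size) if i not in claimed]
        if len(bits) != len(free):
            raise ValueError("bit string length does not match remaining positions")
        flat = [0] * size                  # eliminated positions are re-inserted as 0
        for pos, b in zip(free, bits):
            flat[pos] = int(b)
        cells.append(flat)
        claimed.update(i for i, b in enumerate(flat) if b == 1)
    return cells


print(restore_cells(["100010", "1010", "11"], size=6))
```

For a 2×3 matrix scanned row by row, the three strings expand back to the flat cell matrices 100010, 010100, and 001001.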
Fifth step: restore the non-sparse matrix from the sparse (cell) matrices
Step 1: Create a matrix B with the same numbers of rows and columns as the cell matrices, so B is an m×n matrix. Create an empty sequence T.
Step 2: Take the cell matrix group C_SQ.
Step 3: For every cell matrix C_SQ[k], when C_SQ[k]_ij = 1, set B_ij = S_Q[k] (i ∈ [0, m-1], j ∈ [0, n-1], k ∈ [0, L_SQ - 1]); then B = C.
Step 4: Restore the data in matrix B: take the elements out of B in matrix-element order and append them to sequence T in order.
Step 5: Repeat steps 2 to 4 until all related cell matrices have been processed, then discard the data in T after the first L_R elements of Q (that is, the padded "0"s), so that T = R.
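The restoration steps can be sketched for h = 1 as follows. The function name `rebuild_sequence` is hypothetical, `order` stands in for the table S_Q, and `length` plays the role of L_R.

```python
def rebuild_sequence(cell_matrices, order, length):
    """Merge cell matrices back into matrix B, flatten it, and drop the padding."""
    m, n = len(cell_matrices[0]), len(cell_matrices[0][0])
    B = [[None] * n for _ in range(m)]
    for symbol, cell in zip(order, cell_matrices):
        for i in range(m):
            for j in range(n):
                if cell[i][j] == 1:
                    B[i][j] = symbol       # B_ij = S_Q[k] where C_SQ[k]_ij = 1
    T = [x for row in B for x in row]      # take elements out in matrix order
    return T[:length]                      # discard the padded tail


cells = [
    [[1, 0, 0], [0, 1, 1]],   # symbol 0 (the last 1 is the padded position)
    [[0, 1, 0], [1, 0, 0]],   # symbol 1
    [[0, 0, 1], [0, 0, 0]],   # symbol 2
]
print(rebuild_sequence(cells, order=[0, 1, 2], length=5))   # [0, 1, 2, 1, 0]
```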
Sixth step: iterative decoding
If multiple iterations were used during encoding, decoding must likewise be iterated here.
The data processing method provided by the embodiment of the present invention preprocesses the matrix composed of the characters to be encoded into multiple cell matrices, converts each cell matrix into a binary number according to the elimination rule, entropy codes each binary number to obtain a compressed binary number, converts the compressed binary number again, and finally encodes it according to the preset coding rule, so that the coding output is compressed further and the compression ratio increases. The method can be applied iteratively to obtain a better compression effect, and the inverse operation of the compression algorithm restores the original data without loss.
Refer to Fig. 6, Fig. 6 is that a kind of functional module of data processing equipment 210 that third embodiment of the invention provides is shown
It is intended to, described data processing equipment 210 includes the first modular converter 211, cancellation module 212, the first coding module 213, second
Modular converter 214 and the second coding module 215.
Described first modular converter 211, for a non-sparse matrix is converted to multiple cell matrixs, wherein, each
Described cell matrix is corresponding with one of described non-sparse matrix symbol respectively, element 1 in each described cell matrix
Position in preset order table for the position symbol corresponding with described cell matrix is corresponding, and described preset order table refers to institute
State the table that in non-sparse matrix, all symbols arrange according to preset order.
Described cancellation module 212, for by each described cell matrix, eliminating after rule process according to default, shape respectively
Become corresponding multiple binary number.
Wherein, described default elimination rule includes for first described cell matrix scanning formation binary system in order
Number, and first described cell matrix is designated as eliminating matrix;Using second described cell matrix as matrix to be canceled;Obtain
Take and eliminate the position that in matrix, element 1 occurs, be designated as eliminating position;According to described elimination position, by described matrix to be canceled
The element of middle corresponding position is deleted;Scan surplus element in described matrix to be canceled, form binary number;By described in second
Cell matrix is designated as eliminating matrix, and continues with next cell matrix, until all cell matrixs are disposed.
Described first coding module 213, for respectively each binary number described being carried out entropy code, forms multiple compressions
Binary number afterwards.
Described second modular converter 214, if include multiple continuous 1 for the binary number after described compression, by institute
State the binary number after compression to be changed according to default transformation rule, obtain the binary number after conversion.
The preset conversion rule includes: setting the value of a first label to 0 and the value of a second label to 1; taking the first bit of the compressed binary number as the current character to be encoded; if the current character to be encoded is 1, outputting the value of the second label as the code for that character, and swapping the values of the first label and the second label; if the current character to be encoded is 0, outputting the value of the first label as the code for that character, with the values of the first label and the second label unchanged; and taking the next character as the current character to be encoded and converting it in the same way, until all characters of the compressed binary number have been converted, which yields the converted binary number.
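The label-swapping rule above can be sketched in Python as follows. The function names are hypothetical. The inverse function is not taken from the text; it is included only to illustrate why the conversion is reversible, since under this rule each output bit equals the running XOR of the input bits seen so far.

```python
def label_convert(bits):
    """Apply the two-label conversion rule to a sequence of 0/1 values."""
    first_label, second_label = 0, 1
    out = []
    for bit in bits:
        if bit == 1:
            # Output the second label, then swap the two labels.
            out.append(second_label)
            first_label, second_label = second_label, first_label
        else:
            # Output the first label; the labels stay as they are.
            out.append(first_label)
    return out


def label_restore(bits):
    """Assumed inverse: each original bit is the XOR of neighbouring outputs."""
    previous = 0
    out = []
    for bit in bits:
        out.append(bit ^ previous)
        previous = bit
    return out


converted = label_convert([1, 1, 1, 0, 0, 1])   # -> [1, 0, 1, 1, 1, 0]
restored = label_restore(converted)             # -> [1, 1, 1, 0, 0, 1]
```

In this example the run of three consecutive 1s in the input becomes the alternating pattern 1, 0, 1 in the output.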
The second coding module 215 is configured to encode the converted binary number according to a preset coding rule to obtain the coding output.
Each of the above modules may be implemented in software code, in which case the modules may be stored in the memory of the server 200. Each of the above modules may equally be implemented in hardware, such as an integrated circuit chip.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
The data processing device provided by the embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiment. For brevity, for anything not mentioned in the device and system embodiments, refer to the corresponding content of the foregoing method embodiment.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architecture, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a portable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc. It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
The foregoing is merely the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A data processing method, characterized in that the method comprises:
converting a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is a table in which all symbols in the non-sparse matrix are arranged in a preset order;
processing each cell matrix according to a preset elimination rule to form a corresponding binary number for each cell matrix;
entropy-coding each binary number separately to form multiple compressed binary numbers;
if a compressed binary number contains multiple consecutive 1s, converting the compressed binary number according to a preset conversion rule to obtain a converted binary number; and
encoding the converted binary number according to a preset coding rule to obtain a coding output.
2. The method according to claim 1, characterized in that the preset elimination rule includes:
scanning the first cell matrix sequentially to form a binary number, and marking the first cell matrix as the eliminated matrix;
taking the second cell matrix as the matrix to be eliminated; obtaining the positions at which element 1 occurs in the eliminated matrix, and marking them as elimination positions; deleting, according to the elimination positions, the elements at the corresponding positions in the matrix to be eliminated; scanning the remaining elements of the matrix to be eliminated to form a binary number; and
marking the second cell matrix as the eliminated matrix and continuing with the next cell matrix, until all cell matrices have been processed.
3. The method according to claim 1, characterized in that the entropy coding is arithmetic coding or range coding.
4. The method according to claim 1, characterized in that the preset conversion rule includes:
setting the value of a first label to 0 and the value of a second label to 1;
taking the first bit of the compressed binary number as the current character to be encoded;
if the current character to be encoded is 1, outputting the value of the second label as the code for that character, and swapping the values of the first label and the second label;
if the current character to be encoded is 0, outputting the value of the first label as the code for that character, with the values of the first label and the second label unchanged; and
taking the next character as the current character to be encoded and converting it in the same way, until all characters of the compressed binary number have been converted, to obtain the converted binary number.
5. The method according to claim 4, characterized in that, after the converted binary number is obtained, the method further comprises:
converting the converted binary number repeatedly according to the preset conversion rule, and recording the number of conversions.
6. The method according to claim 1, characterized in that the preset coding rule includes:
expanding an initial coding space to obtain an expanded space, and dividing the expanded initial coding space according to a static statistical model of the characters in the converted binary number, to obtain the coding space corresponding to the current character to be encoded;
expanding the coding space corresponding to the current character to be encoded to obtain an expanded coding space, and dividing the expanded coding space according to the statistical model of the characters, to obtain the coding space corresponding to the next character to be encoded;
taking the next character to be encoded as the current character to be encoded, until all characters in the converted binary number have been encoded, to obtain a coding result; and
taking the coding result, the length of the data to be encoded, and a first statistical parameter as the coding output, the first statistical parameter being the number of 1s contained in the data to be encoded.
7. The method according to claim 5, characterized in that the coding output further includes the number of conversions.
8. A data processing device, characterized in that the device comprises:
a first conversion module, configured to convert a non-sparse matrix into multiple cell matrices, wherein each cell matrix corresponds to one symbol in the non-sparse matrix, the position of element 1 in each cell matrix corresponds to the position, in a preset order table, of the symbol corresponding to that cell matrix, and the preset order table is a table in which all symbols in the non-sparse matrix are arranged in a preset order;
an elimination module, configured to process each cell matrix according to a preset elimination rule to form a corresponding binary number for each cell matrix;
a first coding module, configured to entropy-code each binary number separately to form multiple compressed binary numbers;
a second conversion module, configured to, if a compressed binary number contains multiple consecutive 1s, convert the compressed binary number according to a preset conversion rule to obtain a converted binary number; and
a second coding module, configured to encode the converted binary number according to a preset coding rule to obtain a coding output.
9. The device according to claim 8, characterized in that the preset elimination rule includes:
scanning the first cell matrix sequentially to form a binary number, and marking the first cell matrix as the eliminated matrix;
taking the second cell matrix as the matrix to be eliminated; obtaining the positions at which element 1 occurs in the eliminated matrix, and marking them as elimination positions; deleting, according to the elimination positions, the elements at the corresponding positions in the matrix to be eliminated; scanning the remaining elements of the matrix to be eliminated to form a binary number; and
marking the second cell matrix as the eliminated matrix and continuing with the next cell matrix, until all cell matrices have been processed.
10. The device according to claim 8, characterized in that the preset conversion rule includes:
setting the value of a first label to 0 and the value of a second label to 1;
taking the first bit of the compressed binary number as the current character to be encoded;
if the current character to be encoded is 1, outputting the value of the second label as the code for that character, and swapping the values of the first label and the second label;
if the current character to be encoded is 0, outputting the value of the first label as the code for that character, with the values of the first label and the second label unchanged; and
taking the next character as the current character to be encoded and converting it in the same way, until all characters of the compressed binary number have been converted, to obtain the converted binary number.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610701324.7A CN106452451B (en) | 2016-08-22 | 2016-08-22 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106452451A (en) | 2017-02-22 |
CN106452451B (en) | 2019-09-13 |
Family
ID=58181396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610701324.7A Expired - Fee Related CN106452451B (en) | 2016-08-22 | 2016-08-22 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106452451B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109474281A (en) * | 2018-09-30 | 2019-03-15 | 湖南瑞利德信息科技有限公司 | Data encoding, coding/decoding method and device |
CN109525249A (en) * | 2018-09-30 | 2019-03-26 | 湖南瑞利德信息科技有限公司 | Coding-decoding method, system, readable storage medium storing program for executing and computer equipment |
WO2023061177A1 (en) * | 2021-10-12 | 2023-04-20 | 深圳智慧林网络科技有限公司 | Multi-data sending method, apparatus and device based on columnar data scanning, and multi-data receiving method, apparatus and device based on columnar data scanning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699646A (en) * | 2013-12-24 | 2014-04-02 | 吕志强 | Tagging reversible compression method for binary data |
CN104753626A (en) * | 2013-12-25 | 2015-07-01 | 华为技术有限公司 | Data compression method, equipment and system |
US20160004715A1 (en) * | 2014-07-02 | 2016-01-07 | International Business Machines Corporation | Minimizing Metadata Representation In A Compressed Storage System |
CN105791832A (en) * | 2016-03-08 | 2016-07-20 | 湖南千年华光软件开发有限公司 | Data coding method, data decoding method, data coding system and data decoding system |
Non-Patent Citations (2)
Title |
---|
王防修 等: "通过哈弗曼编码实现文件的压缩与解压", 《武汉工业学院学报》 * |
陈鹏 等: "基于约束等距的块稀疏压缩采样匹配追踪算法", 《系统工程与电子技术》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109474281A (en) * | 2018-09-30 | 2019-03-15 | 湖南瑞利德信息科技有限公司 | Data encoding, coding/decoding method and device |
CN109525249A (en) * | 2018-09-30 | 2019-03-26 | 湖南瑞利德信息科技有限公司 | Coding-decoding method, system, readable storage medium storing program for executing and computer equipment |
CN109525249B (en) * | 2018-09-30 | 2023-10-27 | 湖南瑞利德信息科技有限公司 | Encoding and decoding method, system, readable storage medium and computer device |
WO2023061177A1 (en) * | 2021-10-12 | 2023-04-20 | 深圳智慧林网络科技有限公司 | Multi-data sending method, apparatus and device based on columnar data scanning, and multi-data receiving method, apparatus and device based on columnar data scanning |
Also Published As
Publication number | Publication date |
---|---|
CN106452451B (en) | 2019-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2544895B2 (en) | Distributed data processing system | |
US20210274225A1 (en) | Methods and apparatuses for encoding and decoding a bytestream | |
US7921145B2 (en) | Extending a repetition period of a random sequence | |
TW303549B (en) | ||
CN106452451A (en) | Data processing method and device | |
CN110021369B (en) | Gene sequencing data compression and decompression method, system and computer readable medium | |
CN106301385A (en) | The method and apparatus carrying out reasonable compression and decompression for logarithm | |
CN106445890A (en) | Data processing method | |
JPH08167852A (en) | Method and device for compressing data | |
Yang et al. | Universal lossless data compression with side information by using a conditional MPM grammar transform | |
Arming et al. | Data compression in hardware—The Burrows-Wheeler approach | |
CN106484753A (en) | Data processing method | |
JP4758494B2 (en) | Circuit and method for converting bit length to code | |
CN109698703B (en) | Gene sequencing data decompression method, system and computer readable medium | |
CN109831544A (en) | A kind of coding and storing method and system applied to E-mail address | |
CN108829930A (en) | The light weight method of three-dimensional digital technological design MBD model | |
Howard et al. | Parallel lossless image compression using Huffman and arithmetic coding | |
CN110111851B (en) | Gene sequencing data compression method, system and computer readable medium | |
Barbay | From time to space: Fast algorithms that yield small and fast data structures | |
CN105931278A (en) | Methods And Apparatus For Two-dimensional Block Bit-stream Compression And Decompression | |
Park et al. | Encoding weights of irregular sparsity for fixed-to-fixed model compression | |
CN206712982U (en) | A kind of Huffman coded systems for VLSI designs | |
JPH05241775A (en) | Data compression system | |
US11907329B2 (en) | Convolution calculation apparatus and method | |
CN109660262A (en) | A kind of character coding method and system applied to E-mail address |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190913 Termination date: 20200822 |