CN103412864B

CN103412864B - Data compression storage method

Info

Publication number: CN103412864B
Application number: CN201310223387.2A
Authority: CN
Inventors: 王志恒; 周琛
Original assignee: Leinas Technology (beijing) Ltd By Share Ltd
Current assignee: Leinas Technology (beijing) Ltd By Share Ltd
Priority date: 2013-06-06
Filing date: 2013-06-06
Publication date: 2017-04-05
Anticipated expiration: 2033-06-06
Also published as: CN103412864A

Abstract

The invention discloses a data compression storage method comprising a data generation algorithm storing step, a data definition storing step and a data storing step. The invention matches data against the algorithm in the data mark definition. If the match succeeds, the current value can be compressed; if the match fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. This solves the problem of huge storage space occupation, reduces the number of hardware devices deployed, miniaturizes the deployment environment and saves deployment cost. The invention enables compressed storage of real-time, frequent, large-volume data and greatly reduces storage space occupation.

Description

A data compression storage method

Technical field

The present invention relates to a data compression storage method, and more particularly to a data compression storage method suitable for real-time, frequent, large-volume data. It belongs to the technical field of data processing.

Background

At present, data storage stores every data point, which takes up a very large amount of storage space. Compressed data storage is therefore a goal that data storage must reach and an inevitable outcome. The present invention matches data against the algorithm in the data mark definition. If the match succeeds, the current value can be compressed. If the match fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. The present invention can effectively compress real-time, frequent, large-volume data so that it can be stored efficiently. This solves the problem of huge storage space occupation, reduces the number of hardware devices deployed, miniaturizes the deployment environment and saves deployment cost.

Summary of the invention

Technical problem solved by the present invention: to overcome the deficiencies of the prior art by providing a data compression storage method that can effectively compress real-time, frequent, large-volume data so that it can be stored efficiently.

The technical solution of the present invention is a data compression storage method comprising a data generation algorithm storing step, a data definition storing step and a data storing step.

Data algorithm generation step: define the data algorithm Z=F(U,V), U=f(x,y), where x is the input data set, y is the related calculation result value of x, U is the initial value determined by the function f(x,y), Z is the node value determined by the function F(U,V), and V is an incrementing real number or a time data mark.

Data definition storing step: identify the collected data and judge whether the identified data M satisfies the data algorithm definition. If it does, give data M an algorithm mark; otherwise do not mark it.

Data storing step:

(1) Data storage begins.

(2) For the input data N, look up whether an algorithm mark exists for data N. If no algorithm mark exists, store the data directly. If an algorithm mark exists in F(U,V), query the last storage node.

(3) If the last storage node exists, judge whether the current value satisfies the algorithm of the storage node. If the last storage node does not exist, generate a data storage node, indicating the algorithm initial value and the serial number.

(4) If the current value satisfies the algorithm of the storage node, take out the algorithm F(U,V) and the initial value U of the last storage node and advance the position of the input data set in the initial value U, that is, advance the position of the set x in U=f(x,y). Recalculate according to the algorithm F(U,V) to obtain the current value S. If the current value S is not equal to the original node value N, generate a new data storage node, indicating the algorithm initial value and the serial number; otherwise increase the serial number of the data storage node.

(5) Increment the serial number V of the current storage node, then store the data.

(6) Calculate the initial value U from U=f(x,y) and the current x value of the set, generate a storage node with node value N, algorithm F(U,V), initial value U and data mark V, then store the data.

(7) Data storage ends.
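The data storing step can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `StorageNode` and `store_point`, the use of a plain list for ordinary points, and the assumption that V advances in steps of 1 are all hypothetical choices made for illustration. The comparison of the recalculated value S with the input value N follows step (4).

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StorageNode:
    algo: str   # algorithm mark of the node
    u: float    # initial value U, computed when the node is created
    v: float    # serial number V, advanced for every compressed point


def store_point(
    x: float,
    n: float,
    algo_mark: Optional[str],
    algorithms: Dict[str, Callable[[float], float]],
    nodes: List[StorageNode],
    raw: List[Tuple[float, float]],
) -> None:
    """Store one input value n observed at position x (steps (2) to (6))."""
    # (2) No algorithm mark: store the value directly.
    if algo_mark is None:
        raw.append((x, n))
        return
    f = algorithms[algo_mark]
    # (2) Algorithm mark present: query the last storage node of that algorithm.
    last = next((nd for nd in reversed(nodes) if nd.algo == algo_mark), None)
    if last is not None:
        # (4) Advance the position by one (assumed step size) and recompute S.
        s = f(last.v + 1)
        if math.isclose(s, n):
            # (5) S equals N: only the serial number of the node grows.
            last.v += 1
            return
    # (3)/(6) No node yet, or S differs from N: generate a new storage node.
    nodes.append(StorageNode(algo=algo_mark, u=f(x), v=x))
```

In this sketch a run of consecutive matching points costs one node regardless of its length, while any point without an algorithm mark is kept verbatim in `raw`.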

Compared with the prior art, the advantages of the present invention are as follows. The present invention matches data against the algorithm in the data mark definition. If the match succeeds, the current value can be compressed; if the match fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. This solves the problem of huge storage space occupation, reduces the number of hardware devices deployed, miniaturizes the deployment environment and saves deployment cost. The present invention enables compressed storage of real-time, frequent, large-volume data and greatly reduces storage space occupation.

Brief description of the drawings

Fig. 1 is a flow chart of the data definition storing step of the present invention;

Fig. 2 is a flow chart of the data storing step of the present invention;

Fig. 3 is a comparison chart of the points collected in real time and the marked points.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.

As shown in Fig. 1, the implementation of the present invention comprises a data generation algorithm storing step, a data definition storing step and a data storing step.

Data storing step:

(1) Data storage begins.

(7) Data storage ends.

Embodiment:

During long-term testing, the electrical signal was found to vary with time according to the following rule. Within the period from 0 to 30 seconds, as time x increases, the signal satisfies function expression (1):

y = sin(x) + 1, x ∈ [0, 30]    (1)

Within the period from 45 to 50 seconds, as time x increases, the signal satisfies function expression (2):

y = sin(x) + 2, x ∈ [45, 50]    (2)

In this case, the compression algorithm can be used for data storage.

The steps are as follows:

First step: define the algorithm Z=F(U,V), U=f(x,y).

Expression (1) above can then be converted into:

Therefore, define algorithm 1: f₁(x,y)

Thus each value in V produces one Z value, and these values form a new set:

Z = {sin(0)+1, sin(1)+1, sin(2)+1, sin(3)+1, …, sin(29)+1, sin(30)+1}

Similarly, expression (2) can be converted into:

Therefore, define algorithm 2: f₂(x,y)

Z = {sin(45)+2, sin(46)+2, sin(47)+2, sin(48)+2, sin(49)+2, sin(50)+2}
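As an illustration, the two Z sets can be generated directly from expressions (1) and (2). The sketch below assumes that V takes the integer seconds of each interval, which is consistent with the sets listed above; the names `f1`, `f2`, `V1`, `V2`, `Z1` and `Z2` are hypothetical.

```python
import math


def f1(v: float) -> float:
    """Algorithm 1, from expression (1): sin(v) + 1."""
    return math.sin(v) + 1


def f2(v: float) -> float:
    """Algorithm 2, from expression (2): sin(v) + 2."""
    return math.sin(v) + 2


# Assumed sets V: one value per integer second of each interval.
V1 = range(0, 31)    # 0, 1, ..., 30
V2 = range(45, 51)   # 45, 46, ..., 50

# Each value in V produces one Z value.
Z1 = [f1(v) for v in V1]
Z2 = [f2(v) for v in V2]

print(len(Z1), len(Z2))
```

Only the function and the range of V need to be kept to regenerate either set, which is the property the storage method relies on.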

Second step: mark the data points.

Data points are marked according to the algorithm defined in the first step, with the values of set V and set Z in one-to-one correspondence. For example, algorithm 1: f₁(x,y) produces 31 data points:

{(0, sin(0)+1),

(1, sin(1)+1),

(2, sin(2)+1),

(3, sin(3)+1),

…,

(29, sin(29)+1),

(30, sin(30)+1)}

These 31 points are marked f₁(x,y).

Algorithm 2: f₂(x,y) produces 6 data points, following expression (2):

{(45, sin(45)+2),

(46, sin(46)+2),

(47, sin(47)+2),

(48, sin(48)+2),

(49, sin(49)+2),

(50, sin(50)+2)}

These 6 data points are marked f₂(x,y).

Besides the points belonging to algorithm 1 and algorithm 2, the data actually collected also contain other irregular, scattered points. These data are not given an algorithm mark.
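The marking rule of this step can be sketched as a lookup that returns an algorithm mark for a collected point, or nothing for an irregular point. This is an illustrative assumption about how the mark could be decided: the table `MARKED`, the function `algorithm_mark` and the floating-point tolerance of `math.isclose` are hypothetical and not taken from the patent. The functions follow expressions (1) and (2).

```python
import math
from typing import Optional

# Marked positions (set V) and function for each algorithm of the embodiment.
MARKED = {
    "f1": (range(0, 31), lambda v: math.sin(v) + 1),
    "f2": (range(45, 51), lambda v: math.sin(v) + 2),
}


def algorithm_mark(x: int, y: float) -> Optional[str]:
    """Return the mark of point (x, y), or None if it matches no algorithm."""
    for name, (positions, f) in MARKED.items():
        # A point is marked only if x lies in V and y equals the value in Z.
        if x in positions and math.isclose(y, f(x)):
            return name
    return None
```

For example, `algorithm_mark(0, 1.0)` yields `"f1"`, while a collected point such as `(10, 4.0)` gets no mark because its value differs from sin(10)+1.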

Third step: collect and store the data.

Continuing the example above, suppose the device collects 60 seconds of data as shown in Fig. 3, where the blue "-*-" markers are the points marked f₁(x,y), the green "-*-" markers are the points marked f₂(x,y), and the pink points are the collected points. At the collected point x=10, the collected value is y=4.

Data storage starts from x=0:

1. After the data of the first point, i.e. (0,1), is obtained, use the marks made in the second step to confirm whether this point carries an algorithm mark. If it does not, it is stored in the normal way; if it does, it is stored as an algorithm-marked data point.

In this example, this data point is marked with algorithm 1, so the point (0,1) is stored as an algorithm-marked data point. The search for a storage node of the algorithm 1 storage class begins. In this example no storage node is found for this data point, so the first storage node of algorithm 1: f₁(x,y) is generated, V is initialized to V={0}, and U is initialized to U=sin(0)+1, i.e. U=1. The point at x=0, i.e. (0,1), is now saved.

2. Next, the second point is received and checked as in step 1 to confirm whether this data point is marked with algorithm 1 (as shown in Fig. 3). It is, so the storage node of algorithm 1 is looked up. The node was already created when the first point was stored, so only V needs to be refreshed to V={1}. If there were no storage node of algorithm 1: f₁(x,y), a storage node for this algorithm would be created as in step 1.

3. Each subsequent point is stored in the same way up to x=9, at which point V has been refreshed to V={9}.

4. At x=10, the collected point is y=4, whereas the marked point is (10, sin(10)+1), so this point does not satisfy the storage algorithm rule and is stored as an ordinary point.

5. At x=11, the point is confirmed to be a data point marked with algorithm 1, so the last storage node of algorithm 1 is looked up. In this example it is the storage node created in step 1, last written in step 3, with U=1 and V={9}. After V is advanced to V={10} according to the algorithm, the calculated function value is sin(10)+1, whereas the original value of the node is (11, sin(11)+1). The two values are not equal, so this point cannot be stored by simply refreshing V in the first storage node. A new storage node of f₁(x,y), i.e. the second storage node, is therefore created and initialized with V={11}, U=sin(11)+1. There are now two storage nodes of algorithm 1.

6. At x=12, following the same procedure as for x=11, the point is confirmed to be a data point marked with algorithm 1. The last storage node found for it is the one created in step 5; the point is stored in this node and V is refreshed to V={12}.

7. Each point up to x=30 is stored in turn, and V is refreshed accordingly to V={30}.

8. None of the data points collected between x=31 and x=44 carries an algorithm mark, so they are stored as ordinary points.

9. The data points collected between x=45 and x=50 are marked with algorithm 2. They are stored in a manner similar to that of algorithm 1: f₁(x,y).

10. All points after x=50 are ordinary data points and are stored as ordinary data.
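The comparison made in step 5 can be checked numerically. The short sketch below only evaluates the two values named in that step; the variable names are hypothetical, and the use of `math.isclose` as the equality test is an assumption.

```python
import math

last_v = 9                      # V of the first algorithm 1 node after x = 9
s = math.sin(last_v + 1) + 1    # value obtained after advancing V to 10
n = math.sin(11) + 1            # value of the marked point at x = 11
same = math.isclose(s, n)       # equality test between S and N

print(f"S = {s:.5f}, N = {n:.5f}, equal: {same}")
```

S is roughly 0.456 while N is very close to zero, so the two values differ and a second storage node is needed, as described in step 5.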

With this storage method:

Algorithm 1: f₁(x,y) has two storage nodes. The first storage node is marked U=sin(0)+1, V={9}, and the second storage node is marked U=sin(11)+1, V={30}.

Algorithm 2: f₂(x,y) has one storage node, marked U=sin(45)+2, V={50}.

All other points are stored as ordinary points.
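The whole walkthrough can be replayed with a small, self-contained simulation. It is a sketch under stated assumptions rather than the patented implementation: samples are taken at integer seconds 0 to 60, irregular points get an arbitrary placeholder value of 5.0, V advances in steps of 1, and the names `ALGOS`, `Node`, `mark`, `store` and `collect` are hypothetical.

```python
import math
from dataclasses import dataclass

# Algorithms of the embodiment: mark -> (function, marked positions V).
ALGOS = {
    "f1": (lambda v: math.sin(v) + 1, range(0, 31)),
    "f2": (lambda v: math.sin(v) + 2, range(45, 51)),
}


@dataclass
class Node:
    algo: str   # algorithm mark
    u: float    # initial value U
    v: int      # latest serial number V


def mark(x, y):
    """Return the algorithm mark of (x, y), or None for an ordinary point."""
    for name, (f, positions) in ALGOS.items():
        if x in positions and math.isclose(y, f(x)):
            return name
    return None


def store(x, y, nodes, raw):
    """Store one collected point, compressing it into a node when possible."""
    name = mark(x, y)
    if name is None:
        raw.append((x, y))          # ordinary point, stored as is
        return
    f = ALGOS[name][0]
    last = next((n for n in reversed(nodes) if n.algo == name), None)
    if last is not None and math.isclose(f(last.v + 1), y):
        last.v += 1                 # value continues the node: refresh V only
        return
    nodes.append(Node(name, f(x), x))   # otherwise create a new storage node


def collect(x):
    """Hypothetical acquisition: y = 4 at x = 10, placeholder 5.0 elsewhere."""
    if x == 10:
        return 4.0
    for f, positions in ALGOS.values():
        if x in positions:
            return f(x)
    return 5.0


nodes, raw = [], []
for x in range(0, 61):
    store(x, collect(x), nodes, raw)

for node in nodes:
    print(node)
print("ordinary points:", len(raw))
```

Under these assumptions the run ends with the same three storage nodes as listed above (V={9} and V={30} for algorithm 1, V={50} for algorithm 2), and the remaining 25 of the 61 samples are kept as ordinary points.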

Matters not described in detail in the present invention are well known to those skilled in the art.

Claims

1. A data compression storage method, characterized by comprising: a data generation algorithm storing step, a data definition storing step and a data storing step;

Data algorithm generation step: define the data algorithm Z=F(U,V), U=f(x,y), where x is the input data set, y is the related calculation result value of x, U is the initial value determined by the function f(x,y), Z is the node value determined by the function F(U,V), and V is an incrementing real number or a time data mark;

Data definition storing step: identify the collected data, and judge whether the identified data M satisfies the data algorithm definition; if it does, give data M an algorithm mark, otherwise do not mark it;

Data storing step:

(1) data storage begins;

(2) for the input data N, look up whether an algorithm mark exists for data N; if no algorithm mark exists, store the data directly; if an algorithm mark exists in F(U,V), query the last storage node;

(3) if the last storage node exists, judge whether the current value satisfies the algorithm of the storage node; if the last storage node does not exist, generate a data storage node, indicating the algorithm initial value and the serial number;

(4) if the current value satisfies the algorithm of the storage node, take out the algorithm F(U,V) and the initial value U of the last storage node and advance the position of the input data set in the initial value U, that is, advance the position of the set x in U=f(x,y); recalculate according to the algorithm F(U,V) to obtain the current value S; if the current value S is not equal to the original node value N, generate a new data storage node, indicating the algorithm initial value and the serial number; otherwise increase the serial number of the data storage node;

(6) calculate the initial value U from U=f(x,y) and the current x value of the set, generate a storage node with node value N, algorithm F(U,V), initial value U and data mark V, then store the data;

(7) data storage ends.