CN103412864B - A kind of data compression storage method - Google Patents
- Publication number: CN103412864B
- Application number: CN201310223387.2A
- Authority: CN (China)
- Prior art keywords: data, algorithm, storage, memory node, initial value
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscape: Management, Administration, Business Operations System, And Electronic Commerce
Abstract
The invention discloses a data compression storage method comprising a data algorithm generation step, a data definition storage step, and a data storage step. Collected data are matched against the algorithm named in their data-mark definition. When the match succeeds, the current value is compressed; when it fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. This addresses the problem of heavy storage consumption, reduces the amount of hardware to deploy, shrinks the deployment environment, and lowers deployment cost. The method supports compressed storage of real-time, frequent, high-volume data.
Description
Technical field
The present invention relates to a data compression storage method, and in particular to one suited to real-time, frequent, high-volume data. It belongs to the technical field of data processing.
Background art
At present, data storage stores every data point individually, which takes up a great deal of storage space. Compressed storage is therefore a goal that data storage needs to reach. The present invention matches data against the algorithm in the data-mark definition. If the match succeeds, the current value is compressed; if it fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. The invention can effectively compress real-time, frequent, high-volume data so that it can be stored efficiently. This addresses the problem of heavy storage consumption, reduces the amount of hardware to deploy, shrinks the deployment environment, and lowers deployment cost.
Summary of the invention
Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a data compression storage method that can effectively compress real-time, frequent, high-volume data so that it can be stored efficiently.
Technical solution of the invention: a data compression storage method comprising a data algorithm generation step, a data definition storage step, and a data storage step.
Data algorithm generation step: define the data algorithm Z = F(U, V) with U = f(x, y), where x is the input data set, y is the calculation result value associated with x, U is the initial value determined by the function f(x, y), Z is the node value determined by the function F(U, V), and V is an increasing real number or a time-data mark.
Data definition storage step: identify the collected data, then determine whether the identified data M satisfies a data algorithm definition. If it does, mark M with the algorithm identifier; otherwise leave it unmarked.
Data storage step:
(1) Start data storage.
(2) For input data N, look up whether an algorithm identifier exists for N. If no identifier exists, store the data directly. If an identifier exists for F(U, V), query the last storage node.
(3) If the last storage node exists, determine whether the current value satisfies the algorithm of that node. If the last storage node does not exist, generate a data storage node and record the algorithm, the initial value, and the serial number.
(4) If the current value satisfies the algorithm of the storage node, take the algorithm F(U, V) and the initial value U from the last storage node and advance the position of the input data set within U, i.e. advance the position of the set x in U = f(x, y). Recompute the current value S according to F(U, V). If S is not equal to the original node value N, generate a new data storage node and record the algorithm, the initial value, and the serial number; otherwise increment the serial number of the data storage node.
(5) Increment the serial number V of the current storage node, then store the data.
(6) Compute the initial value U from U = f(x, y) and the current x value of the set; create a storage node with node value N, algorithm F(U, V), initial value U, and data identifier V; then store the data.
(7) End data storage.
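The decision logic of storage steps (2) to (6) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Node`, `Store`, and `put`, the `start` field, the tolerance passed to `math.isclose`, and the linear algorithm `lin` are all hypothetical, and each algorithm is reduced to a single function of the serial number V. The source compares only the recomputed value S with the incoming value N; the extra check that the incoming position equals V + 1 is an added safeguard.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    algo: str   # algorithm identifier
    u: float    # initial value U of the node
    start: int  # serial number of the first point covered (assumed field)
    v: int      # serial number V of the last point covered


class Store:
    def __init__(self, algos: Dict[str, Callable[[int], float]]) -> None:
        self.algos = algos
        self.nodes: Dict[str, List[Node]] = {name: [] for name in algos}
        self.raw: List[Tuple[int, float]] = []  # points stored in original form

    def put(self, x: int, n: float, mark: Optional[str]) -> None:
        if mark is None:
            # Step (2): no algorithm identifier, store the data directly.
            self.raw.append((x, n))
            return
        f = self.algos[mark]
        chain = self.nodes[mark]
        last = chain[-1] if chain else None  # step (2): query the last node
        if last is not None:
            s = f(last.v + 1)  # step (4): advance V and recompute S
            if last.v + 1 == x and math.isclose(s, n, abs_tol=1e-9):
                last.v = x     # step (5): only the serial number grows
                return
        # Steps (3) and (6): no usable node, create one with U computed at x.
        chain.append(Node(mark, f(x), x, x))


# Hypothetical linear algorithm Z = 2V + 1, used only to exercise the sketch.
store = Store({"lin": lambda v: 2.0 * v + 1.0})
for x, n, mark in [(0, 1.0, "lin"), (1, 3.0, "lin"), (2, 5.0, "lin"),
                   (3, 99.0, None), (4, 9.0, "lin"), (5, 11.0, "lin")]:
    store.put(x, n, mark)
```

In this toy run the unmarked point at x = 3 interrupts the sequence, so the points at x = 4 and x = 5 go into a second node instead of extending the first.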
Advantages of the invention over the prior art: the invention matches data against the algorithm in the data-mark definition. When the match succeeds, the current value is compressed; when it fails, the value is stored in its original form. Each identical node then needs to be stored only once, which frees the space that would otherwise hold repeated data nodes. This addresses the problem of heavy storage consumption, reduces the amount of hardware to deploy, shrinks the deployment environment, and lowers deployment cost. The invention supports compressed storage of real-time, frequent, high-volume data and substantially reduces storage occupancy.
Description of the drawings
Fig. 1 is the flow chart of data definition storage step of the present invention;
Fig. 2 is the flow chart of data storing steps of the present invention;
Fig. 3 compares the points collected in real time with the marked points.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
As shown in Fig. 1, the method comprises a data algorithm generation step, a data definition storage step, and a data storage step, each carried out as set out above.
Embodiment:
Long-term testing showed that an electrical signal varies with time according to the following pattern. Within 0 to 30 seconds, as time x increases, the signal satisfies expression (1):
y = sin(x) + 1, x ∈ [0, 30]    (1)
Within 45 to 50 seconds, as time x increases, the signal satisfies expression (2):
y = sin(x) + 2, x ∈ [45, 50]    (2)
In this situation the compression algorithm can be used for data storage.
The steps are as follows.
First step: define the algorithm Z = F(U, V), U = f(x, y).
Expression (1) can then be rewritten in the form Z = F(U, V), which defines Algorithm 1: f1(x, y).
Each value in V then produces one Z value, and together these values form a new set:
Z = {sin(0)+1, sin(1)+1, sin(2)+1, sin(3)+1, ..., sin(29)+1, sin(30)+1}
Likewise, expression (2) can be rewritten in the form Z = F(U, V), which defines Algorithm 2: f2(x, y).
Each value in V then produces one Z value, and together these values form a new set:
Z = {sin(45)+2, sin(46)+2, sin(47)+2, sin(48)+2, sin(49)+2, sin(50)+2}
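As a small illustration of the first step, the two sets Z can be generated from the corresponding sets V. The names `algorithm_1`, `algorithm_2`, `V1`, `V2`, `Z1`, and `Z2` are hypothetical, V is assumed to take integer seconds, and the sine argument is assumed to be in radians, which the source does not state.

```python
import math


def algorithm_1(v: float) -> float:
    """Z value of expression (1) at serial number v."""
    return math.sin(v) + 1.0


def algorithm_2(v: float) -> float:
    """Z value of expression (2) at serial number v."""
    return math.sin(v) + 2.0


V1 = list(range(0, 31))    # V for Algorithm 1: 0, 1, ..., 30
V2 = list(range(45, 51))   # V for Algorithm 2: 45, 46, ..., 50

Z1 = [algorithm_1(v) for v in V1]  # one Z value per value in V1
Z2 = [algorithm_2(v) for v in V2]  # one Z value per value in V2
```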
Second step: mark the data points.
Using the algorithms defined in the first step and the one-to-one correspondence between the values of set V and set Z, mark the data points. For example, Algorithm 1: f1(x, y) produces 31 data points:
{(0, sin(0)+1),
(1, sin(1)+1),
(2, sin(2)+1),
(3, sin(3)+1),
...,
(29, sin(29)+1),
(30, sin(30)+1)}
These 31 points are marked f1(x, y).
Algorithm 2: f2(x, y) produces 6 data points:
{(45, sin(45)+2),
(46, sin(46)+2),
(47, sin(47)+2),
(48, sin(48)+2),
(49, sin(49)+2),
(50, sin(50)+2)}
These 6 data points are marked f2(x, y).
Besides the points covered by Algorithm 1 and Algorithm 2, the data actually collected also contain other, irregularly scattered points. These data are given no algorithm mark.
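The marking of the second step can be sketched as a lookup table from each x to its algorithm mark and expected value. The names `MARKED` and `mark_of` and the tolerance `tol` are assumptions for illustration; the source only states that a collected point either coincides with a marked point or does not.

```python
import math
from typing import Dict, Optional, Tuple


def f1(x: float) -> float:
    return math.sin(x) + 1.0


def f2(x: float) -> float:
    return math.sin(x) + 2.0


# x -> (algorithm mark, expected value): 31 points for f1 and 6 points for f2.
MARKED: Dict[int, Tuple[str, float]] = {}
for x in range(0, 31):
    MARKED[x] = ("f1", f1(x))
for x in range(45, 51):
    MARKED[x] = ("f2", f2(x))


def mark_of(x: int, y: float, tol: float = 1e-9) -> Optional[str]:
    """Return the algorithm mark of a collected point, or None if it has none."""
    entry = MARKED.get(x)
    if entry is None:
        return None  # no algorithm covers this x
    name, expected = entry
    return name if abs(y - expected) <= tol else None
```

For instance, `mark_of(10, 4.0)` returns `None` because the collected value 4 differs from sin(10)+1, so that point would be stored in original form.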
Third step: collect and store the data.
Continuing the example above, suppose the device collects 60 seconds of data as shown in Fig. 3, where the blue "-*-" series is the points marked f1(x, y), the green "-*-" series is the points marked f2(x, y), and the pink points are the collected points. At the collected point x = 10, the collected value is y = 4.
Data storage starts from x = 0.
1. After the first point (0, 1) is received, check against the marks from the second step whether it is a point marked with an algorithm. If it is not, store it in the ordinary way; if it is, store it as an algorithm-marked data point.
In this example the data point is marked with Algorithm 1, so (0, 1) is stored as an algorithm-marked data point. The search for a storage node of the Algorithm 1 class finds none for this point, so the first storage node of Algorithm 1: f1(x, y) is generated, with V initialized to V = {0} and U initialized to U = sin(0)+1, i.e. U = 1. The point at x = 0, i.e. (0, 1), is now stored.
2. When the second point is received, it is checked as in step 1 and confirmed to be a point marked with Algorithm 1 (see Fig. 3). The storage node of Algorithm 1 is then looked up. It was created when the first point was stored, so only V needs to be refreshed, to V = {1}. If no storage node of Algorithm 1: f1(x, y) existed, one would be created as in step 1.
3. Each subsequent point up to x = 9 is stored in the same way. At this step V has been refreshed to V = {9}.
4. At x = 10 the collected value is y = 4, whereas the marked point is (10, sin(10)+1). The point does not satisfy the storage algorithm rule, so it is stored as an ordinary point.
5. At x = 11 the point is confirmed to be marked with Algorithm 1. The last storage node of Algorithm 1 is looked up; in this example it is the node created in step 1 and last written in step 3, with U = 1 and V = {9}. Refreshing V to V = {10} according to the algorithm gives the function value sin(10)+1, whereas the original node value is (11, sin(11)+1). The two values are unequal, so this point cannot be stored by simply refreshing V in the first storage node. A new storage node of f1(x, y), the second one, is created and initialized with V = {11} and U = sin(11)+1. Algorithm 1 now has two storage nodes.
6. At x = 12 the point is checked as at x = 11 and confirmed to be marked with Algorithm 1. The last storage node found is the one created in step 5; the point is stored in that node and V is refreshed to V = {12}.
7. Each point up to x = 30 is stored in turn, and V is refreshed accordingly to V = {30}.
8. None of the data points collected from x = 31 to x = 44 is marked with an algorithm, so they are stored as ordinary points.
9. The data points collected from x = 45 to x = 50 are marked with Algorithm 2. They are stored in the same way as for Algorithm 1: f1(x, y).
10. All data points after x = 50 are ordinary data points and are stored as ordinary data.
With this storage scheme:
Algorithm 1: f1(x, y) has two storage nodes. The first is marked U = sin(0)+1, V = {9}; the second is marked U = sin(11)+1, V = {30}.
Algorithm 2: f2(x, y) has one storage node, marked U = sin(45)+2, V = {50}.
All other points are stored as ordinary points.
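The walkthrough can be replayed with a short simulation. This is a sketch under several assumptions that are not in the source: samples are taken at integer seconds x = 0 to 60, the unmarked points carry a placeholder value of 5.0, nodes are kept as `[start_x, U, V]` lists, and equality is tested with `math.isclose`. The names `ALGOS`, `RANGES`, `sample`, and `mark` are hypothetical.

```python
import math

ALGOS = {"f1": lambda v: math.sin(v) + 1.0, "f2": lambda v: math.sin(v) + 2.0}
RANGES = {"f1": range(0, 31), "f2": range(45, 51)}


def sample(x):
    """Hypothetical acquisition: follows the algorithms except at x = 10."""
    if x == 10:
        return 4.0  # the outlier named in the embodiment
    for name, r in RANGES.items():
        if x in r:
            return ALGOS[name](x)
    return 5.0  # placeholder for the irregular points (assumed value)


def mark(x, y):
    """Return the algorithm mark of (x, y), or None for an ordinary point."""
    for name, r in RANGES.items():
        if x in r and math.isclose(ALGOS[name](x), y, abs_tol=1e-9):
            return name
    return None


nodes = {"f1": [], "f2": []}  # each node is [start_x, U, V]
raw = []                      # ordinary points stored in original form
for x in range(0, 61):
    y = sample(x)
    name = mark(x, y)
    if name is None:
        raw.append((x, y))
        continue
    chain = nodes[name]
    # Extend the last node only if advancing V by one reproduces this point.
    if chain and chain[-1][2] + 1 == x and math.isclose(
            ALGOS[name](chain[-1][2] + 1), y, abs_tol=1e-9):
        chain[-1][2] = x
    else:
        chain.append([x, y, x])  # new node: U is the value at its first x
```

Under these assumptions the run ends with two f1 nodes (starting at x = 0 with V = 9, and at x = 11 with V = 30), one f2 node (starting at x = 45 with V = 50), and 25 ordinary points, which matches the node summary given in the embodiment.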
Matters not described in detail in the present invention belong to technology known to those skilled in the art.
Claims (1)
1. A data compression storage method, characterized by comprising a data algorithm generation step, a data definition storage step, and a data storage step;
Data algorithm generation step: define the data algorithm Z = F(U, V) with U = f(x, y), where x is the input data set, y is the calculation result value associated with x, U is the initial value determined by the function f(x, y), Z is the node value determined by the function F(U, V), and V is an increasing real number or a time-data mark;
Data definition storage step: identify the collected data, then determine whether the identified data M satisfies a data algorithm definition; if it does, mark M with the algorithm identifier; otherwise leave it unmarked;
Data storage step:
(1) start data storage;
(2) for input data N, look up whether an algorithm identifier exists for N; if no identifier exists, store the data directly; if an identifier exists for F(U, V), query the last storage node;
(3) if the last storage node exists, determine whether the current value satisfies the algorithm of that node; if the last storage node does not exist, generate a data storage node and record the algorithm, the initial value, and the serial number;
(4) if the current value satisfies the algorithm of the storage node, take the algorithm F(U, V) and the initial value U from the last storage node and advance the position of the input data set within U, i.e. advance the position of the set x in U = f(x, y); recompute the current value S according to F(U, V); if S is not equal to the original node value N, generate a new data storage node and record the algorithm, the initial value, and the serial number; otherwise increment the serial number of the data storage node;
(5) increment the serial number V of the current storage node, then store the data;
(6) compute the initial value U from U = f(x, y) and the current x value of the set; create a storage node with node value N, algorithm F(U, V), initial value U, and data identifier V; then store the data;
(7) end data storage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310223387.2A CN103412864B (en) | 2013-06-06 | 2013-06-06 | A kind of data compression storage method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103412864A (en) | 2013-11-27 |
CN103412864B (en) | 2017-04-05 |
Family
ID=49605876
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201310223387.2A | Active | CN103412864B (en) | 2013-06-06 | 2013-06-06 | A kind of data compression storage method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103412864B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112260694B (en) * | 2020-09-21 | 2022-01-11 | 广州中望龙腾软件股份有限公司 | Data compression method of simulation file |
CN112783056B (en) * | 2021-01-04 | 2022-09-23 | 潍柴动力股份有限公司 | Data programming method, device and equipment of ECU and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1908932A (en) * | 2005-08-05 | 2007-02-07 | 北京人大金仓信息技术有限公司 | Huge amount of data compacting storage method and implementation apparatus therefor |
CN101882141A (en) * | 2009-05-08 | 2010-11-10 | 北京众志和达信息技术有限公司 | Method and system for implementing repeated data deletion |
CN102831222A (en) * | 2012-08-24 | 2012-12-19 | 华中科技大学 | Differential compression method based on data de-duplication |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8407193B2 (en) * | 2010-01-27 | 2013-03-26 | International Business Machines Corporation | Data deduplication for streaming sequential data storage applications |
Also Published As
Publication number | Publication date |
---|---|
CN103412864A (en) | 2013-11-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | | Address after: 100081 Shenzhou building, South Avenue, Haidian District, Beijing, 402, Zhongguancun; Applicant after: Leinas Technology (Beijing) Limited by Share Ltd; Address before: 100081 Shenzhou building, South Avenue, Haidian District, Beijing, 402, Zhongguancun; Applicant before: China Spacesat Co., Ltd. |
COR | Change of bibliographic data | ||
GR01 | Patent grant | ||
GR01 | Patent grant |