CN109918374A

CN109918374A - Method and terminal device for mass data storage

Info

Publication number: CN109918374A
Application number: CN201910126297.9A
Authority: CN
Inventors: 田森; 常子祯; 安平凯; 黄小浦
Original assignee: Zhongke Hengyun Co Ltd
Current assignee: Zhongke Hengyun Co Ltd
Priority date: 2019-02-20
Filing date: 2019-02-20
Publication date: 2019-06-21

Abstract

The present invention is applicable to the technical field of data processing and provides a method and terminal device for mass data storage. The method comprises: obtaining data to be stored; dividing the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute; and storing the multiple column data in a column-oriented form, which reduces the storage space required and improves storage space utilization.

Description

Method and terminal device for mass data storage

Technical field

The invention belongs to the technical field of data processing, and in particular relates to a method and terminal device for mass data storage.

Background

With the rapid development of Internet technology, applications and services running on the Internet have emerged in large numbers, and the era of big data has arrived. Mass data is usually stored row by row: each row stores all attributes of one record, and every attribute occupies storage space. When mass data is stored with this existing technique, storage space utilization is low because every attribute of every record must occupy storage space.

Summary of the invention

In view of this, embodiments of the present invention provide a method and terminal device for mass data storage, to solve the problem in the prior art that, when mass data is stored and read, storage space utilization is low because every attribute of a record must occupy storage space.

A first aspect of the embodiments of the present invention provides a method for mass data storage, comprising:

obtaining data to be stored;

dividing the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute; and

storing the multiple column data in a column-oriented form.

In one embodiment, after the data to be stored is divided into multiple column data according to the multiple preset attributes, each column data corresponding to one preset attribute, the method further comprises:

determining, among all the column data, the primary key column data corresponding to the primary key data; and

arranging the columns in the order of the primary key column data followed by the column data other than the primary key column data.

In one embodiment, after the columns are arranged in the order of the primary key column data followed by the column data other than the primary key column data, the method further comprises:

partitioning all the column data according to a preset rule to obtain partitioned column data.

In one embodiment, partitioning all the column data according to the preset rule to obtain the partitioned column data comprises:

partitioning the primary key column data according to the preset rule to obtain partitioned primary key column data; and

partitioning the column data other than the primary key column data in the same way as the primary key column data to obtain the partitioned column data.

In one embodiment, after all the column data is partitioned according to the preset rule to obtain the partitioned column data, the method further comprises:

determining the corresponding demand column data according to the attributes of the data required by a business demand; and

establishing, according to the business demand, an index table between each partition and the demand column data, the index table comprising an index number, a partition label, and the demand column data corresponding to the partition label.

In one embodiment, storing the multiple column data in a column-oriented form comprises:

compressing the partitioned column data, and storing the compressed column data in a column-oriented form.

In one embodiment, compressing the partitioned column data comprises:

A. storing any first data in the column data in a cache, and comparing each byte of the first data in turn with the bytes of second data stored in a first storage area;

B. if the first data and the second data undergo first matching and there are no identical bytes, determining that the first-character position information of the bytes to be compressed in the first data is first information, outputting the compared byte of the first data, and storing the compared bytes of the first data in the first storage area in sequence until the number of bytes of the second data stored in the first storage area reaches a preset byte length;

C. if, when the first data and the second data undergo repeated matching, there are no identical bytes or the length of consecutive identical bytes is less than the preset byte length, determining that the first-character position information of the bytes to be compressed in the first data is second information, and outputting the compared byte of the first data;

D. if, when the first data and the second data undergo repeated matching, the length of consecutive identical bytes is greater than or equal to the preset byte length, determining that the first-character position information of the bytes to be compressed in the first data is third information, and outputting the position information of the bytes in the first data that are identical to bytes in the second data;

E. determining the incompressible bytes in the first data according to the second data, the first-character position information of the bytes to be compressed, and the output compared bytes of the first data or the output position information of the identical bytes;

F. determining the compressed data of the first data according to the second data, the first-character position information of the bytes to be compressed, and the incompressible bytes;

G. compressing all data in the column data according to the method of steps A, B, C, D, E and F.

A second aspect of the embodiments of the present invention provides a device for mass data storage, comprising:

an obtaining module, configured to obtain data to be stored;

a determining module, configured to divide the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute; and

a storage module, configured to store the multiple column data in a column-oriented form.

A third aspect of the embodiments of the present invention provides a terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for mass data storage described above.

A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for mass data storage described above.

Compared with the prior art, the embodiments of the present invention have the following beneficial effect: by dividing the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute, and storing the multiple column data in a column-oriented form, the embodiments solve the problem that, when mass data is stored with the prior art, storage space utilization is low because every attribute of a record must occupy storage space, and thereby improve storage space utilization.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for mass data storage provided by an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another method for mass data storage provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of a method, provided by an embodiment of the present invention, for partitioning all column data according to a preset rule;

Fig. 4 is a schematic diagram of a method, provided by an embodiment of the present invention, for compressing column data;

Fig. 5 is an example diagram of a device for mass data storage provided by an embodiment of the present invention;

Fig. 6 is a schematic diagram of a storage module provided by an embodiment of the present invention;

Fig. 7 is a schematic diagram of a terminal device provided by an embodiment of the present invention.

Detailed description of the embodiments

In the following description, specific details such as particular system structures and techniques are set forth for purposes of illustration rather than limitation, to provide a thorough understanding of the embodiments of the present invention. However, it will be clear to those skilled in the art that the present invention may also be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present invention.

The technical solutions of the present invention are illustrated below through specific embodiments.

Fig. 1 is a schematic flowchart of a method for mass data storage provided by an embodiment of the present invention, detailed as follows.

Step 101: obtain the data to be stored.

Optionally, the data to be stored may be any big data on the Internet that needs to be stored, such as the merchandise sales data of an e-commerce business or the basic information data of users.

Step 102: divide the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute.

Optionally, this embodiment is described using user basic information data as an example, as shown in Table 1.

Table 1: User basic information data

Dividing the above user basic information data by attribute yields one column data per attribute. The column data for the attribute "name" is Zhang, Lee, Mr. Wang, Guo, etc.; the column data for the attribute "gender" is female, male, male, female, etc.; the column data for the attribute "age" is 25, 29, 38, 52, etc.; the column data for the attribute "weight" is 95, 158, 165, 113, etc.; the column data for the attribute "height" is 163, 182, 176, 165, etc.; and the column data for the attribute "phone number" is 185****2256, 137****8477, 159****5466, 177****4678, etc. The column data of the user basic information is shown in Table 2.

Table 2: Column data of the user basic information
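The division in step 102 can be illustrated with a minimal Python sketch. The function name `split_into_columns` and the English attribute names are assumptions made for illustration; only the sample values come from the example above.

```python
from typing import Dict, List, Sequence

# Preset attributes of the user basic information example (names are illustrative).
ATTRIBUTES = ["name", "gender", "age", "weight", "height", "phone"]


def split_into_columns(rows: Sequence[Sequence], attributes: List[str]) -> Dict[str, list]:
    """Divide row records into one list of values per preset attribute."""
    columns: Dict[str, list] = {attr: [] for attr in attributes}
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("each record needs one value per preset attribute")
        for attr, value in zip(attributes, row):
            columns[attr].append(value)
    return columns


rows = [
    ("Zhang", "female", 25, 95, 163, "185****2256"),
    ("Lee", "male", 29, 158, 182, "137****8477"),
    ("Mr. Wang", "male", 38, 165, 176, "159****5466"),
    ("Guo", "female", 52, 113, 165, "177****4678"),
]
columns = split_into_columns(rows, ATTRIBUTES)
print(columns["age"])  # [25, 29, 38, 52]
```

Each key of `columns` then plays the role of one column data in Table 2.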

Optionally, as shown in Fig. 2, after the data to be stored is divided into multiple column data according to the multiple preset attributes in step 102, the above method for mass data storage may further include steps 104 and 105.

Step 104: determine, among all the column data, the primary key column data corresponding to the primary key data.

The primary key data in this step is one or more fields that uniquely identify a record in the table. For example, for the first row of the above user basic information data (Zhang, female, 25, 95, 163, 185****2256), either the name or the phone number could serve as the primary key data. This embodiment uses the phone number as the primary key data, so the column data corresponding to the phone number is the primary key column data, for example: 185****2256, 137****8477, 159****5466, 177****4678, etc.

Step 105: arrange the columns in the order of the primary key column data followed by the column data other than the primary key column data.

Optionally, the column order in step 105 may be: column 6, the column data for the phone number; column 1, the column data for the name; column 2, the column data for the gender; column 3, the column data for the age; column 4, the column data for the weight; column 5, the column data for the height; and so on.
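The ordering of step 105 amounts to moving the key column to the front while keeping the other columns in their original order. A minimal sketch, in which the helper name `order_columns` and the attribute names are hypothetical:

```python
from typing import List


def order_columns(attributes: List[str], key: str) -> List[str]:
    """Return the column order: key column first, then the remaining columns."""
    if key not in attributes:
        raise ValueError(f"unknown key column: {key}")
    return [key] + [attr for attr in attributes if attr != key]


order = order_columns(["name", "gender", "age", "weight", "height", "phone"], "phone")
print(order)  # ['phone', 'name', 'gender', 'age', 'weight', 'height']
```

This matches the order given in the text: column 6 first, then columns 1 to 5.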

Optionally, as shown in Fig. 2, after step 105 the above method for mass data storage may further include step 106.

Step 106: partition all the column data according to a preset rule to obtain partitioned column data.

As shown in Fig. 3, step 106 may specifically include steps 301 and 302.

Step 301: partition the primary key column data according to the preset rule to obtain partitioned primary key column data.

Optionally, the preset rule in this step can be configured according to the user's needs. Optionally, partitioning may be based on the first letter of the primary key: for example, primary keys whose first letter is a to g belong to partition 1, those whose first letter is h to o belong to partition 2, and so on. Optionally, partitioning may be based on the numeric value of the primary key: for example, when the primary key consists of digits, such as a telephone number, the keys are sorted by their first digit; when the first digits are all the same, they are sorted by the second digit, and so on. Optionally, keys whose second digit is 1 to 3 belong to partition 1, those whose second digit is 4 to 6 belong to partition 2, and those whose second digit is 7 to 9 belong to partition 3. In this embodiment, 137****8477 is then in partition 1, 159****5466 is in partition 2, and 177****4678 and 185****2256 are in partition 3.
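The second-digit rule from this example can be written as a small function. This is only a sketch of that one rule; the name `partition_of` is hypothetical, and a second digit of 0 is rejected because the text does not say which partition it would belong to.

```python
def partition_of(phone: str) -> int:
    """Map a phone-number key to a partition using its second digit."""
    second_digit = int(phone[1])
    if 1 <= second_digit <= 3:
        return 1
    if 4 <= second_digit <= 6:
        return 2
    if 7 <= second_digit <= 9:
        return 3
    # Assumption: the example rule leaves a second digit of 0 undefined.
    raise ValueError("second digit 0 is not covered by the example rule")


keys = ["185****2256", "137****8477", "159****5466", "177****4678"]
assignment = {key: partition_of(key) for key in sorted(keys)}
print(assignment)
```

With the four sample keys, the result places 137****8477 in partition 1, 159****5466 in partition 2, and the remaining two keys in partition 3, as stated in the text.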

Step 302: partition the column data other than the primary key column data in the same way as the primary key column data to obtain the partitioned column data.

Optionally, the partitioned column data in step 302 may be:

Optionally, partitioning the column data means that a query only needs to scan the data of the relevant partition, which makes queries faster and more efficient.
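A minimal sketch of steps 301 and 302 together: the partition label is computed from the key column only, and every other column follows the same row-to-partition assignment. The names `second_digit_rule` and `partition_columns` are assumptions for illustration, and only three of the six columns are shown to keep the example short.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def second_digit_rule(phone: str) -> int:
    """Example preset rule: second digit 1-3, 4-6, 7-9 maps to partition 1, 2, 3."""
    digit = int(phone[1])
    if digit == 0:
        raise ValueError("the example rule does not cover a second digit of 0")
    return (digit - 1) // 3 + 1


def partition_columns(
    columns: Dict[str, List], key: str, rule: Callable[[str], int]
) -> Dict[int, Dict[str, List]]:
    """Split every column by the partition that the key column value falls into."""
    partitions: Dict[int, Dict[str, List]] = defaultdict(
        lambda: {name: [] for name in columns}
    )
    for row, key_value in enumerate(columns[key]):
        label = rule(key_value)
        for name, values in columns.items():
            partitions[label][name].append(values[row])
    return dict(partitions)


columns = {
    "phone": ["185****2256", "137****8477", "159****5466", "177****4678"],
    "name": ["Zhang", "Lee", "Mr. Wang", "Guo"],
    "age": [25, 29, 38, 52],
}
parts = partition_columns(columns, "phone", second_digit_rule)
print(parts[3]["name"])  # ['Zhang', 'Guo']
```

A query for a key whose second digit is 7 to 9 then only has to scan `parts[3]` rather than every row.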

Optionally, after the column data is partitioned, a corresponding index table can be established to facilitate data queries. Specifically, after step 106, the above method for mass data storage may further include steps A and B.

Step A: determine the corresponding demand column data according to the attributes of the data required by the business demand.

Optionally, if the business needs to query the age distribution, the corresponding demand column data may be determined to be the index column, the phone number column and the age column, or alternatively the index column and the age column.

Step B: establish, according to the business demand, an index table between each partition and the demand column data, the index table comprising an index number, a partition label, and the demand column data corresponding to the partition label.

Optionally, the index table established in step B may be as follows:

Optionally, once the index table is established, the corresponding partition can be located through the index, and the data required by the business demand can be queried by partition label, which is faster and more efficient than querying the required data directly.
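One possible in-memory shape for such an index table is sketched below, for the age-distribution demand. The layout (a list of dictionaries) and the name `build_index_table` are assumptions; the source only specifies that each entry carries an index number, a partition label and the demand column data of that partition.

```python
from typing import Dict, List


def build_index_table(
    partitions: Dict[int, Dict[str, List]], demand_columns: List[str]
) -> List[dict]:
    """Create one entry per partition holding only the demanded columns."""
    table = []
    for index_no, label in enumerate(sorted(partitions), start=1):
        entry = {"index": index_no, "partition": label}
        for column in demand_columns:
            entry[column] = partitions[label][column]
        table.append(entry)
    return table


# Partitioned columns from the phone-number example (second-digit rule).
partitions = {
    1: {"phone": ["137****8477"], "age": [29]},
    2: {"phone": ["159****5466"], "age": [38]},
    3: {"phone": ["177****4678", "185****2256"], "age": [52, 25]},
}
index_table = build_index_table(partitions, ["age"])
ages_in_partition_3 = next(e["age"] for e in index_table if e["partition"] == 3)
print(ages_in_partition_3)  # [52, 25]
```

Looking up the ages for one partition touches a single index entry and never reads the columns that the business demand does not need.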

Further, storing mass data directly requires a large amount of storage space. Therefore, in step 103, the partitioned column data may first be compressed, and the compressed column data then stored in a column-oriented form.

In one embodiment, as shown in Fig. 4, compressing the partitioned column data may include steps 401 to 407.

Step 401: store any first data in the column data in a cache, and compare each byte of the first data in turn with the second data stored in a first storage area.

Optionally, the first data is any one item of data in the column data. For example, the first data may be any of the phone numbers shown in "column 6" of Table 2; the first data may of course also be other data. To describe the compression steps in detail, this embodiment optionally takes the first data from another column of the user basic information data table, for example a column whose attribute is the user's physical examination report number, and describes the compression of the first data using "12345678987123456789" as an example.

Optionally, this embodiment is described with the first storage area being a hash storage area. Before this step the first storage area is empty, i.e. it stores no data.

Step 402: if the first data and the second data undergo first matching and there are no identical bytes, determine that the first-character position information of the bytes to be compressed in the first data is first information, output the compared byte of the first data, and store the compared bytes of the first data in the first storage area in sequence until the number of bytes of the second data stored in the first storage area reaches a preset byte length.

Optionally, when the first data is compressed, each byte of the first data is analyzed in turn. The first byte "1" is analyzed first: "1" is compared with the second data. Because the second data does not yet exist at the start, there is no byte identical to "1". The first-character position of the bytes to be compressed in the first data is therefore determined to be empty, i.e. the first information in this step is empty; "1" is output and stored in the hash storage area as the first byte of the second data.

The second byte of the first data, "2", is then analyzed: "2" is compared with the second data "1". The two differ, so the first-character position of the bytes to be compressed in the first data is determined to be empty, "2" is output, and "2" is stored in the hash storage area as the second byte of the second data, which is now "12". The third through eighth bytes of the first data are analyzed in the same way, as shown in Table 3. For ease of storage, the maximum byte length of the hash storage area can be set empirically, for example to 8 bytes, in which case the resulting second data is "12345678".

Table 3: Compression process for the first data

Step 403: if, when the first data and the second data undergo repeated matching, there are no identical bytes or the length of consecutive identical bytes is less than the preset byte length, determine that the first-character position information of the bytes to be compressed in the first data is second information, and output the compared byte of the first data.

Further, as shown in Table 3, the ninth byte of the first data, "9", is analyzed: it is compared with each byte of the second data and matches none of them. The first-character position information of the bytes to be compressed in the first data is determined to be the second information, which here may be 0; "0" indicates that the first character "1" of the bytes to be compressed, among the bytes of the first data that undergo repeated matching, is located at position 0. "9" is output.

Optionally, the preset byte length in this step is the maximum byte length of the second data; in this embodiment the preset byte length may be 8 bytes.

Further, as shown in Table 3, the tenth byte of the first data, "8", is analyzed: it is compared with each byte of the second data and is identical to the "8" at position 7. Analysis continues with the eleventh byte, "7", which is compared with each byte of the second data; no consecutively identical byte is found, so the length of identical bytes in the repeated matching of the first data and the second data is less than the preset byte length. The first-character position information of the bytes to be compressed in the first data is therefore determined to be 0, indicating that the first character "1" of the bytes to be compressed, among the bytes of the first data that undergo repeated matching, is located at position 0, and the tenth byte "8" of the first data is output.

Analysis continues with the eleventh byte of the first data, "7": compared with each byte of the second data, it is identical to the byte at position 6 of the second data. Analysis then continues with the twelfth byte of the first data, "1": compared with each byte of the second data, it is identical to the byte at position 0 of the second data, but this position is not contiguous with the position of the byte in the second data matched by the eleventh byte. The length of consecutive identical bytes in the repeated matching of the first data and the second data is therefore less than the preset byte length, so the first-character position information of the bytes to be compressed in the first data is determined to be "0", indicating that the first character "1" of the bytes to be compressed, among the bytes of the first data that undergo repeated matching, is located at position 0, and the eleventh byte "7" of the first data is output.

Step 404: if, when the first data and the second data undergo repeated matching, the length of consecutive identical bytes is greater than or equal to the preset byte length, determine that the first-character position information of the bytes to be compressed in the first data is third information, and output the position information of the bytes in the first data that are identical to bytes in the second data.

Further, as shown in Table 3, analysis continues with the twelfth byte of the first data, "1", which is identical to the first byte "1" of the second data. The thirteenth byte of the first data, "2", is then analyzed and is identical to the second byte "2" of the second data, and so on until the nineteenth byte of the first data, "8", which is identical to the eighth byte "8" of the second data. The length of consecutive identical bytes in the repeated matching of the first data and the second data therefore equals the preset byte length. The position information of the repeatedly matched identical bytes of the first data and the second data is determined to be "0 to 7", meaning that positions 0 to 7 of the first data can be compressed, and the first-character position information of the bytes to be compressed in the first data is determined to be the third information, which here may be "0".

Further, analyzing the first data from the twelfth byte onward by the above method gives the following result: the position information of the repeatedly matched identical bytes of the first data and the second data is determined to be "11 to 18", meaning that positions 11 to 18 of the first data can be compressed; the first-character position information of the repeatedly matched identical bytes is determined to be "0" and "11", indicating that one first character "1" of the repeatedly matched bytes in the first data is located at position 0 and another first character "1" is located at position 11. The first-character position information of the bytes to be compressed in the first data is determined to be the third information, which here may be "0" and "11".

The remaining bytes of the first data are analyzed according to the cases described above until the analysis of the first data is complete.

Step 405: determine the incompressible bytes in the first data according to the second data, the first-character position information of the bytes to be compressed, and the output compared bytes of the first data or the output position information of the identical bytes.

Optionally, take the ninth byte "9" of the first data as an example: the second data is "12345678", the first-character position of the bytes to be compressed is "0", and the byte at position 0 of the first data is "1"; the two differ. Since the output compared byte of the first data is "9", "9" is an incompressible byte.

As another example, take the twelfth byte "1" of the first data: the second data is "12345678", the first-character position of the bytes to be compressed is "0", and the byte at position 0 of the first data is "1"; the two are identical. Since the output position information of the identical bytes is <0, 7>, "1" is determined to be a compressible byte.

The incompressible bytes of the first data determined in this step may be "9, 8, 7, 9".

Step 406: determine the compressed data of the first data according to the second data, the first-character position information of the bytes to be compressed, and the incompressible bytes.

Optionally, after the first data is compressed by the method of steps 401 to 405, the compressed data obtained is "12345678,0,11,9,8,7,9". Note that the representation of the compressed data can be configured as required: separators may be added between the second data, the first-character position information of the bytes to be compressed, and the incompressible bytes, as in "12345678,0,11,9,8,7,9"; or no separators may be added, as in "123456780119879".
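The worked example can be reproduced with the following Python sketch. It is a simplified reading of steps 401 to 406, not the exact procedure of the embodiment: it fills the first storage area with the first 8 bytes, records every position at which the whole stored block recurs, and keeps all other bytes as incompressible bytes. The names `compress` and `decompress` and the tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

Compressed = Tuple[str, List[int], List[str]]


def compress(first_data: str, preset_length: int = 8) -> Compressed:
    """Return (second data, first-character positions, incompressible bytes)."""
    if preset_length < 1:
        raise ValueError("preset_length must be at least 1")
    # The first preset_length bytes fill the initially empty first storage area.
    second_data = first_data[:preset_length]
    positions: List[int] = []
    incompressible: List[str] = []
    i = 0
    while i < len(first_data):
        full_match = (
            len(second_data) == preset_length
            and first_data.startswith(second_data, i)
        )
        if full_match:
            # A run of preset_length identical bytes: record where it starts.
            positions.append(i)
            i += preset_length
        else:
            # No match, or a match shorter than preset_length: output the byte.
            incompressible.append(first_data[i])
            i += 1
    return second_data, positions, incompressible


def decompress(second_data: str, positions: List[int], incompressible: List[str]) -> str:
    """Rebuild the original data from the three compressed parts."""
    starts = set(positions)
    total = len(positions) * len(second_data) + len(incompressible)
    remaining = iter(incompressible)
    out: List[str] = []
    pos = 0
    while pos < total:
        if pos in starts:
            out.append(second_data)
            pos += len(second_data)
        else:
            out.append(next(remaining))
            pos += 1
    return "".join(out)


packed = compress("12345678987123456789")
print(packed)  # ('12345678', [0, 11], ['9', '8', '7', '9'])
```

The three parts correspond to the fields of "12345678,0,11,9,8,7,9". The `decompress` function is included only to show that this particular layout is lossless; the source does not describe decompression.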

Step 407: compress all data in the column data according to the method of steps 401 to 406.

Compressing all the column data reduces the storage space it occupies and thus improves disk utilization.

Further, as shown in Fig. 1 or Fig. 2, the multiple column data obtained by dividing the data to be stored according to the multiple preset attributes may be stored as in step 103.

Step 103: store the multiple column data in a column-oriented form.

Optionally, in this embodiment the obtained column data is stored in a column-oriented form. A column value that is empty need not be stored, i.e. empty column values occupy no storage space. By contrast, row-oriented storage must store a column value even when it is empty. Column-oriented storage as used in this embodiment can therefore shorten storage time, reduce storage space, and make reading data more efficient.
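The effect of skipping empty values can be sketched as follows. The sparse per-column dictionary keyed by row number is an assumed layout chosen for illustration, and the empty values in the sample records are hypothetical, since the example tables in the source contain none.

```python
from typing import Dict, List, Optional


def store_columns(rows: List[Dict[str, Optional[object]]], attributes: List[str]):
    """Store each attribute as a sparse column: row number -> non-empty value."""
    columns: Dict[str, Dict[int, object]] = {attr: {} for attr in attributes}
    for row_id, row in enumerate(rows):
        for attr in attributes:
            value = row.get(attr)
            if value is not None:  # empty values are simply not stored
                columns[attr][row_id] = value
    return columns


attributes = ["name", "age", "weight"]
rows = [
    {"name": "Zhang", "age": 25, "weight": 95},
    {"name": "Lee", "age": None, "weight": 158},  # hypothetical missing age
    {"name": "Guo", "age": 52, "weight": None},   # hypothetical missing weight
]
stored = store_columns(rows, attributes)
column_cells = sum(len(column) for column in stored.values())
row_cells = len(rows) * len(attributes)
print(column_cells, row_cells)  # 7 9
```

The row-oriented layout reserves a cell for every attribute of every record, while the column-oriented layout holds only the seven values that actually exist.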

Optionally, this step may include: compressing the partitioned column data and storing the compressed column data in a column-oriented form, which further reduces storage space and improves storage space utilization.

In the above method for mass data storage, the data to be stored is divided into multiple column data according to multiple preset attributes, and the resulting column data is then stored in a column-oriented form, which can shorten storage time, reduce storage space, and make reading data more efficient.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.

Corresponding to the method for mass data storage described in the foregoing embodiments, Fig. 5 shows an example diagram of a device for mass data storage provided by an embodiment of the present invention. As shown in Fig. 5, the device may include an obtaining module 501, a determining module 502 and a storage module 503.

The obtaining module 501 is configured to obtain data to be stored.

The determining module 502 is configured to divide the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute.

Further optionally, after the determining module 502 divides the data to be stored into multiple column data according to the multiple preset attributes, the determining module 502 may also be configured to determine, among all the column data, the primary key column data corresponding to the primary key data, and then to arrange the columns in the order of the primary key column data followed by the column data other than the primary key column data. The obtaining module 501 is also configured to partition all the column data according to a preset rule to obtain partitioned column data.

Optionally, when obtaining the partitioned column data, the obtaining module 501 specifically partitions the primary key column data according to the preset rule to obtain partitioned primary key column data, and partitions the column data other than the primary key column data in the same way as the primary key column data to obtain the partitioned column data.

Optionally, after the obtaining module 501 obtains the partitioned column data, the determining module 502 is also configured to determine the corresponding demand column data according to the attributes of the data required by the business demand, and to establish, according to the business demand, an index table between each partition and the demand column data, the index table comprising an index number, a partition label, and the demand column data corresponding to the partition label.

The storage module 503 is configured to store the multiple column data in a column-oriented form.

Optionally, the storage module 503 may specifically be configured to compress the partitioned column data and store the compressed column data in a column-oriented form.

It is further alternative, as shown in fig. 6, memory module 503 may include comparison submodule 601, determine submodule 602, the comparison submodule 601 and the determining submodule 602 are for compressing the column data after the subregion.

The comparison submodule 601 is configured to store any first data in the column data into a cache, and to compare each byte of the first data in turn with the bytes of second data stored in a first storage area;

The determination submodule 602 is configured to, if the first data and the second data are matched for the first time and have no identical byte, determine that the initial position information of the bytes to be compressed in the first data is first information, output the compared bytes of the first data, and store the compared bytes of the first data in the first storage area in order, until the number of bytes of the second data stored in the first storage area reaches a preset byte length;

The determination submodule 602 is further configured to, if there is no identical byte or the length of consecutive identical bytes is less than the preset byte length when the first data and the second data are matched again, determine that the initial position information of the bytes to be compressed in the first data is second information, and output the compared bytes of the first data;

The determination submodule 602 is further configured to, if the length of consecutive identical bytes is greater than or equal to the preset byte length when the first data and the second data are matched again, determine that the initial position information of the bytes to be compressed in the first data is third information, and output the position information of the bytes in the first data that are identical to bytes in the second data;

The determination submodule 602 is further configured to determine the incompressible bytes in the first data according to the second data, the initial position information of the bytes to be compressed, and the compared bytes of the first data or the position information of the identical bytes that were output;

The determination submodule 602 is further configured to determine the compressed data of the first data according to the second data, the initial position information of the bytes to be compressed, and the incompressible bytes;

All data in the column data is compressed by the comparison submodule 601 and the determination submodule 602 in this way.
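The byte-matching compression described above resembles a dictionary (sliding-window) scheme, and a simplified sketch may make the three cases easier to follow. The code below is an assumption-laden illustration, not the encoding of this embodiment: the constants `FIRST`, `LITERAL`, and `MATCH` stand in for the first, second, and third information; the list `window` stands in for the first storage area and is allowed to grow without bound; and the token layout is invented for the example.

```python
FIRST, LITERAL, MATCH = 1, 2, 3  # stand-ins for the first/second/third information


def longest_match(window, data, pos):
    """Return (start, length) of the longest run in window equal to data[pos:]."""
    best_start, best_len = 0, 0
    for start in range(len(window)):
        length = 0
        while (start + length < len(window)
               and pos + length < len(data)
               and window[start + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    return best_start, best_len


def compress(data, preset_len=4):
    """Turn bytes into a list of (flag, payload) tokens."""
    window = bytearray()  # plays the role of the first storage area
    tokens = []
    pos = 0
    # First match: the window is empty, so nothing can be identical.
    # Output the bytes as they are and fill the window up to preset_len.
    fill = data[:preset_len]
    if fill:
        tokens.append((FIRST, bytes(fill)))
        window.extend(fill)
        pos = len(fill)
    while pos < len(data):
        start, length = longest_match(window, data, pos)
        if length >= preset_len:
            # Long enough run of identical bytes: output only its position.
            tokens.append((MATCH, (start, length)))
            window.extend(data[pos:pos + length])
            pos += length
        else:
            # No match, or too short a match: output the byte itself.
            tokens.append((LITERAL, data[pos:pos + 1]))
            window.append(data[pos])
            pos += 1
    return tokens


def decompress(tokens):
    """Rebuild the original bytes from the tokens produced by compress."""
    out = bytearray()
    for flag, payload in tokens:
        if flag == MATCH:
            start, length = payload
            out.extend(out[start:start + length])
        else:
            out.extend(payload)
    return bytes(out)
```

In this sketch, a column with many repeated values yields mostly `MATCH` tokens, each of which replaces at least `preset_len` bytes with a position and a length, which is where the space saving comes from.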

In the above device for mass data storage, the determining module divides the data to be stored by attribute into multiple column data, one for each attribute, and the storage module then stores the resulting column data in a column-oriented form. This can reduce storage time and storage space, and data can also be read more efficiently.
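The division by attribute and the column-oriented storage summarized above can be sketched as follows. This is a minimal illustration under assumptions: records are represented as Python dictionaries, each column is serialized with JSON, and the function names are hypothetical.

```python
import json


def split_into_columns(records, preset_attributes):
    """Divide row records into one list of values per preset attribute."""
    columns = {attribute: [] for attribute in preset_attributes}
    for record in records:
        for attribute in preset_attributes:
            columns[attribute].append(record.get(attribute))
    return columns


def store_column_oriented(columns):
    """Serialize each column on its own, so columns are stored contiguously."""
    return {attribute: json.dumps(values).encode("utf-8")
            for attribute, values in columns.items()}


def read_column(stored, attribute):
    """Read a single attribute without decoding the other columns."""
    return json.loads(stored[attribute].decode("utf-8"))
```

A query over one attribute then reads only that attribute's stored bytes, and because the values of one column tend to be similar, they also tend to compress well.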

Fig. 7 is a schematic diagram of a terminal device provided by an embodiment of the present invention. As shown in Fig. 7, the terminal device 7 of this embodiment includes a processor 701, a memory 702, and a computer program 703, such as a program for mass data storage, that is stored in the memory 702 and can run on the processor 701. When executing the computer program 703, the processor 701 implements the steps of the above embodiments of the method for mass data storage, such as steps 101 to 103 shown in Fig. 1 or steps 101 to 106 shown in Fig. 2. The processor 701, when executing the computer program 703, implements the functions of the modules in the above device embodiments, such as the functions of modules 501 to 503 shown in Fig. 5.

Illustratively, the computer program 703 may be divided into one or more program modules, which are stored in the memory 702 and executed by the processor 701 to carry out the present invention. The one or more program modules may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program 703 in the device for mass data storage or in the terminal device 7. For example, the computer program 703 may be divided into an acquisition module 501, a determining module 502, and a storage module 503. The specific functions of each module are shown in Fig. 5 and are not repeated here.

The terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, a processor 701 and a memory 702. Those skilled in the art will understand that Fig. 7 is only an example of the terminal device 7 and does not limit it. The terminal device may include more or fewer components than shown, may combine certain components, or may use different components; for example, it may also include input/output devices, network access devices, a bus, and the like.

The processor 701 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

The memory 702 may be an internal storage unit of the terminal device 7, such as a hard disk or internal memory of the terminal device 7. The memory 702 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the terminal device 7. Further, the memory 702 may include both an internal storage unit and an external storage device of the terminal device 7. The memory 702 is used to store the computer program and other programs and data required by the terminal device 7. The memory 702 may also be used to temporarily store data that has been output or is about to be output.

It will be clear to those skilled in the art that, for convenience and brevity of description, the division into the functional units and modules above is given only as an example. In practical applications, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and are not intended to limit the protection scope of this application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.

In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described or recorded in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above method embodiments of the present invention may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content included in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

The embodiments described above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.

Claims

1. A method for mass data storage, characterized by comprising:

obtaining data to be stored;

dividing the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute; and

storing the multiple column data in a column-oriented form.

2. the method for mass data storage as described in claim 1, which is characterized in that press the data to be stored described It carries out being divided into multiple column datas according to multiple preset attributes, after each column data corresponds to a preset attribute, the method is also Include:

3. the method for mass data storage as claimed in claim 2, which is characterized in that described according to the major key columns After being arranged according to, column data in addition to major key column data sequence, the method also includes:

4. the method for mass data storage as claimed in claim 3, which is characterized in that described to press all column datas in advance If rule carries out subregion, the column data after obtaining subregion, comprising:

Column data in addition to the major key column data is subjected to subregion according to the partitioned mode of the major key column data, is divided Column data behind area.

5. the method for mass data storage as claimed in claim 4, which is characterized in that preset in described press all column datas Rule carries out subregion, after the column data after obtaining subregion, the method also includes:

The concordance list of each subregion Yu the demand column data is established according to business demand, the concordance list includes call number, divides Area's label and the corresponding demand column data of the subregion label.

6. the method for mass data storage as claimed in claim 3, which is characterized in that it is described to use the form towards column, it deposits Store up the multiple column data, comprising:

Column data after the subregion is compressed, is deposited compressed multiple column datas using the form towards column Storage.

7. the method for mass data storage as claimed in claim 6, which is characterized in that the column data by after the subregion It is compressed, comprising:

A, by any first data deposit caching in the column data, by each byte in first data successively with the The byte of the second data stored in one memory block is compared；

If B, first data and second data carry out first fit and when without identical bytes, first data are determined In the initial location information of byte to be compressed be the first information, export the byte being compared in first data, and will The byte being compared in first data is sequentially stored in first memory block, until storing in first memory block The byte numbers of second data meet preset byte length；

If without identical bytes or consecutive identical byte long when C, first data and second data carry out repeated matching When degree is less than the preset byte length, determine that the initial location information of byte to be compressed in first data is the second letter Breath, and export the byte being compared in first data；

If consecutive identical byte length is greater than or equal to institute when D, first data carry out repeated matching with second data When stating preset byte length, determine that the initial location information of byte to be compressed in first data is third information, and defeated Location information in first data with identical bytes in second data out；

E, according in second data, the initial location information of the byte to be compressed and first data of output The location information of the byte or identical bytes that are compared determines incompressible byte in first data；

F, according to second data, the initial location information of the byte to be compressed and the incompressible byte, really It is fixed that compressed data are carried out to first data；

8. A device for mass data storage, characterized by comprising:

an acquisition module, configured to obtain data to be stored;

a determining module, configured to divide the data to be stored into multiple column data according to multiple preset attributes, each column data corresponding to one preset attribute;

9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.