CN104699815A - Data processing method and system - Google Patents

Info

Publication number
CN104699815A
Authority
CN
China
Prior art keywords
data
files
blocks
file
data block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510131621.8A
Other languages
Chinese (zh)
Inventor
董旭
冯海涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201510131621.8A
Publication of CN104699815A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1737 Details of further file system functions for reducing power consumption or coping with limited storage space, e.g. in mobile devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to data processing methods and systems, in particular to a data writing method and system and a data reading method and system. The data writing method includes: reading a predetermined number of rows of data to generate a corresponding file block; creating a header file for each file block, where the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block together with its header file forms a data block; and writing the data blocks to storage nodes. With these methods and systems, data can be compressed efficiently, storage cost is lowered, storage space is saved, and data query and analysis become faster.

Description

Data processing method and system
Technical field
Embodiments of the present disclosure relate generally to the field of data storage, and in particular to methods for writing and reading data and to the corresponding systems.
Background
Institutions such as enterprises, organizations and government offices produce large amounts of data in their daily operation. This data keeps growing over time, and the data volume eventually becomes enormous. Such big data provides valuable raw data for business development, statistical analysis, policy making and the like. However, as the volume of gathered or collected data keeps increasing, the load on the systems that store it grows steadily and ever higher demands are placed on the storage architecture. How to improve the storage capacity and query efficiency for massive data has become one of the difficult problems of big data.
Traditional data storage is either row-oriented or column-oriented. Fig. 1(a) and Fig. 1(b) are schematic diagrams of row-oriented and column-oriented storage respectively, in which data is stored by row or by column on multiple different storage nodes (i.e. distributed storage). Both storage modes, however, have their own problems.
For row-oriented storage, mainstream data queries (or analyses) today are column-based, and querying a row-oriented database with column-based query methods is seriously inefficient. For example, suppose a database contains fields such as "ID", "name", "count" and "year", and the data is queried with Structured Query Language (SQL), e.g. "SELECT name FROM order WHERE year=2014". Because a row-oriented database can only be read row by row, the query has to read every row in the database and then extract the data that satisfies the query condition, which makes the query slow. In addition, when only a few columns are queried, the unneeded columns cannot be skipped. And because each row mixes values of different kinds, row-oriented storage does not easily achieve a high compression ratio, i.e. its space utilization is low.
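The cost described above can be illustrated with a minimal sketch. The table contents and the helper name `select_name_where_year` are assumptions made for illustration; only the field names and the query come from the text. The counter shows that a row-by-row scan touches every field of every row even though the query needs only two columns.

```python
# Hypothetical sample table using the four fields named in the text.
FIELDS = ("ID", "name", "count", "year")
ROWS = [
    (1, "alice", 3, 2013),
    (2, "bob", 5, 2014),
    (3, "carol", 2, 2014),
]

def select_name_where_year(rows, year):
    """Evaluate SELECT name FROM order WHERE year=<year> over a row store."""
    name_idx = FIELDS.index("name")
    year_idx = FIELDS.index("year")
    values_touched = 0
    result = []
    for row in rows:
        # A row store returns the whole row, including ID and count,
        # although the query never uses them.
        values_touched += len(row)
        if row[year_idx] == year:
            result.append(row[name_idx])
    return result, values_touched
```

Here the query needs two of the four columns, yet all values of all rows are read.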
Column-oriented storage can solve the low query efficiency of row-oriented storage. However, the different columns of column-oriented data are stored on different storage nodes, so obtaining a complete record carries a high tuple reconstruction cost and excessive network overhead, to the point that the needs of querying big data in a distributed file system cannot be met. For example, when the different columns of the same data unit are stored on different storage nodes, multiple nodes have to be read, repeatedly, to obtain that data unit and reconstruct the related data, which increases network overhead. When a large amount of data is read, this network overhead seriously affects query speed.
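A small sketch of tuple reconstruction in a pure column store follows. The node names, the sample values and the function `reconstruct_tuple` are hypothetical; the point is only that rebuilding one record requires contacting every node that holds one of its columns.

```python
# Hypothetical column store: each column lives on a different storage node.
NODES = {
    "node1": {"ID": [1, 2, 3]},
    "node2": {"name": ["alice", "bob", "carol"]},
    "node3": {"count": [3, 5, 2]},
    "node4": {"year": [2013, 2014, 2014]},
}

def reconstruct_tuple(nodes, row_number):
    """Rebuild one complete record and count how many nodes were contacted."""
    record = {}
    nodes_contacted = 0
    for columns in nodes.values():
        nodes_contacted += 1  # stands in for one network round trip
        for field, values in columns.items():
            record[field] = values[row_number]
    return record, nodes_contacted
```

In a real distributed system each contacted node would be a network request, which is the overhead the paragraph above describes.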
Summary of the invention
One object of the present disclosure is to provide a data writing method and system that solve or alleviate one or more of the above problems of the prior art.
Another object of the present disclosure is to provide a data reading method and system that solve or alleviate one or more of the above problems of the prior art.
According to a first aspect of the present disclosure, a data writing method is provided, comprising: reading a predetermined number of rows of data to generate a corresponding file block; creating a header file for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block; and writing the data block to a storage node.
According to an embodiment of the present disclosure, the data generated in the corresponding file block can be read by row or by column.
According to an embodiment of the present disclosure, writing the data block to a storage node comprises: writing the data block to the storage node by column or by row.
According to an embodiment of the present disclosure, writing the data block to a storage node comprises: compressing the data in the data block before it is written to the storage node.
According to an embodiment of the present disclosure, writing the data block to a storage node comprises: after the size of the data block reaches a predetermined threshold, writing the data blocks to multiple storage nodes in turn.
According to a second aspect of the present disclosure, a data writing system is provided, comprising: a file block generating device configured to read a predetermined number of rows of data to generate a corresponding file block; a header file generating device configured to create a header file for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block; and a writing device configured to write the data block to a storage node.
According to an embodiment of the present disclosure, the data in the file block is stored in a manner that allows it to be read by row or by column.
According to an embodiment of the present disclosure, the writing device is configured to write the data block to the storage node by column or by row.
According to an embodiment of the present disclosure, the writing device comprises a compression unit configured to compress the data in the data block before the data block is written to the storage node.
According to an embodiment of the present disclosure, the writing device is further configured to write the data blocks to multiple storage nodes in turn after the size of the data block reaches a predetermined threshold.
With the data writing method and system according to embodiments of the present disclosure, data is read row by row to generate the corresponding file blocks, which guarantees that the data of one row of the same data record unit is located on the same node. Data analysis therefore benefits from fast data loading and high adaptability to dynamic loads, and the cost of tuple reconstruction is very low. Because the file blocks (or the database) are written to the storage nodes by column, the column data of the raw data can be compressed with a predetermined algorithm, which effectively reduces storage space.
According to a third aspect of the present disclosure, a data reading method is provided, comprising: parsing a query statement in Structured Query Language to generate a query task; reading the header file and obtaining from it the position data of the files relevant to the query task; and extracting data from the file block based on the position data; wherein the header file is created for each file block and comprises an index of the file block itself in the database and an index of the files stored in the file block, each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
According to an embodiment of the present disclosure, extracting data from the file block based on the position data comprises: reading the data stored in the file block by row or by column according to the query task.
According to a fourth aspect of the present disclosure, a data reading system is provided, comprising: a task generating device configured to parse a query statement in Structured Query Language to generate a query task; a position acquisition device configured to read the header file and obtain from it the position data of the files relevant to the query task; and a data extraction device configured to extract data from the file block based on the position data; wherein the header file is created for each file block and comprises an index of the file block itself in the database and an index of the files stored in the file block, each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
According to an embodiment of the present disclosure, the data extraction device is configured to read the data stored in the file block by row or by column according to the query task.
With the data reading method and system according to embodiments of the present disclosure, data can be queried and analyzed quickly, because the data blocks are stored in storage nodes and each file block has a header file. In addition, because all fields of the same data record unit are on the same node, this storage structure guarantees that the data of one row of the same data record unit is located on the same node, so the cost of tuple reconstruction is very low and query efficiency is ensured. Furthermore, a query reads only the rows and columns it needs according to the index file, which reduces network overhead and improves query efficiency.
The above features and other advantages of the present disclosure will become clear from the following description of embodiments.
Brief description of the drawings
Embodiments of the present disclosure are now described, by way of example only, with reference to the accompanying drawings, in which:
Fig. 1(a) and Fig. 1(b) are schematic diagrams of the storage structures of prior-art row-oriented and column-oriented storage respectively;
Fig. 2 is a flowchart of a data writing method according to an exemplary embodiment of the present disclosure;
Fig. 3 shows the data storage structure before and after the data writing method according to an exemplary embodiment of the present disclosure is performed;
Fig. 4 is a schematic diagram of a data storage structure according to an exemplary embodiment of the present disclosure;
Fig. 5 is a schematic diagram of a data writing system according to an exemplary embodiment of the present disclosure;
Fig. 6 is a flowchart of a data reading method according to an exemplary embodiment of the present disclosure; and
Fig. 7 is a schematic diagram of a data reading system according to an exemplary embodiment of the present disclosure.
Detailed description
Embodiments of the present disclosure are now described with reference to the accompanying drawings. Note that the same reference signs may be used in the drawings for similar components or functional modules. The drawings are intended only to illustrate embodiments of the present disclosure. Those skilled in the art can derive alternative embodiments from the following description without departing from the spirit and scope of protection of the disclosure.
Embodiments of the present disclosure are described in detail below with reference to the drawings.
As shown in Fig. 2, according to an embodiment of the present disclosure, a data writing method is provided. The method comprises the following steps. In step S101, a predetermined number of rows of data is read to generate a corresponding file block. In step S102, a header file is created for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block. In step S103, the data block is written to a storage node.
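Steps S101 to S103 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the dictionary used as a header file and the Python list standing in for a storage node are all hypothetical.

```python
from itertools import islice

def generate_file_blocks(rows, rows_per_block):
    """S101: read a predetermined number of rows at a time into file blocks."""
    it = iter(rows)
    while True:
        block = list(islice(it, rows_per_block))
        if not block:
            return
        yield block

def create_header(block_index, file_block):
    """S102: a header holding the block's own index and a simple content index."""
    return {"block_index": block_index, "row_count": len(file_block)}

def write_data_blocks(rows, rows_per_block, storage_node):
    """S103: pair each file block with its header and write it to the node."""
    for i, file_block in enumerate(generate_file_blocks(rows, rows_per_block)):
        data_block = {"header": create_header(i, file_block),
                      "file_block": file_block}
        storage_node.append(data_block)  # a list stands in for a storage node
    return storage_node
```

With five input rows and two rows per block, this produces three data blocks, the last of which holds a single row.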
The scheme of the embodiments of the present disclosure combines the advantages of column-oriented and row-oriented storage: data can be compressed efficiently, storage cost is reduced, storage space is saved, and data query and analysis become faster. In one embodiment, the data may be structured data, semi-structured data (such as logs) or unstructured data.
In embodiments of the present disclosure, data is read by row and the rows read are used to generate the corresponding file block. In one embodiment, a predetermined number of rows is read, for example 20, 50 or 100 rows, and the corresponding file block is generated from these rows. In another embodiment, the size of the file block is preset, and as many rows are read as are needed to produce a file block of the predetermined size. This approach has the following advantage: because data is read by row, the data of the same data record unit usually ends up in one file block. In one embodiment that uses the Hadoop Distributed File System (HDFS), all fields are stored in the same HDFS block. This gives embodiments of the present disclosure fast data loading and high adaptability to dynamic loads. In addition, because all fields of the same data record unit are on the same node, this storage structure guarantees that the data of one row of the same data record unit is located on the same node, so the cost of tuple reconstruction is very low.
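The second variant, where the block size rather than the row count is preset, might look like the sketch below. The comma-joined UTF-8 encoding used to measure a row and the name `group_rows_by_size` are assumptions for illustration; the text does not specify how row size is measured.

```python
def group_rows_by_size(rows, target_bytes):
    """Accumulate whole rows until the group reaches the preset block size."""
    groups, current, current_size = [], [], 0
    for row in rows:
        # Assumed size measure: length of the comma-joined, UTF-8 encoded row.
        encoded = ",".join(map(str, row)).encode("utf-8")
        current.append(row)
        current_size += len(encoded)
        if current_size >= target_bytes:
            groups.append(current)
            current, current_size = [], 0
    if current:
        groups.append(current)  # final, possibly smaller, group
    return groups
```

Rows are never split across groups, so all fields of one record stay in the same file block.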
In embodiments of the present disclosure, a header file is created for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block. Because a header file is created, the files (or data) stored in the file block can be located or queried efficiently through the header file. In addition, the header file also comprises the index of each file block in the database or storage node, so when data is queried across multiple storage nodes, the position of the file block can be located quickly and the relevant data can be found quickly and conveniently.
According to embodiments of the present disclosure, data is read row by row to generate the corresponding file blocks, which guarantees that the data of one row of the same data record unit is located on the same node. Data analysis therefore benefits from fast data loading and high adaptability to dynamic loads, and the cost of tuple reconstruction is very low. Because the file blocks (or the database) are written to the storage nodes by column, the column data of the raw data can be compressed with a predetermined algorithm, which effectively reduces storage space. The compression algorithm may be any of the commonly used data compression algorithms of the prior art.
Fig. 3 shows the data storage structure before and after the data writing method according to an exemplary embodiment of the present disclosure is performed. The left side of Fig. 3 shows the original data block, and the right side shows the data storage structure after the writing algorithm according to an embodiment of the present disclosure has been applied. Note that, for illustration, only the data of four fields A, B, C and D in 5 rows is shown. As shown in the figure, a predetermined number of rows (5 rows in the figure) is read to generate a corresponding file block, and a header file is created for each file block. The header file comprises the index of the file block itself in the database (shown by way of example as a 16-byte Sync; the index may also take another format set according to preset rules) and the index of the files stored in the file block (shown by way of example in the figure as a tuple data header, through which the position of data within the file block can be located efficiently). Each file block and its corresponding header file form a data block (Row group1, Row group2 in the figure; the rightmost part is the structure of a data block), and the data blocks are written to the storage nodes (Row group1, Row group2, ... are stored in the storage nodes by row). The concept of the invention can be understood more clearly from the structure of Fig. 3.
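A byte-level sketch of a data block in the spirit of Fig. 3 is given below. Only the 16-byte Sync and the idea of a header that locates data inside the block come from the text. The concrete layout (a column count followed by per-column byte lengths, with newline-joined values) and the function names are assumptions made for illustration, and values are assumed not to contain newlines.

```python
import struct

def build_data_block(rows, sync):
    """Lay out one row group as: 16-byte sync | header | column payloads."""
    assert len(sync) == 16
    columns = list(zip(*rows))  # transpose rows into columns
    payloads = ["\n".join(map(str, col)).encode("utf-8") for col in columns]
    # Assumed header format: column count, then byte length of each column.
    header = struct.pack("<I", len(payloads))
    header += b"".join(struct.pack("<I", len(p)) for p in payloads)
    return sync + header + b"".join(payloads)

def read_column(block, column_number):
    """Use the header to jump straight to one column, skipping the others."""
    (n_cols,) = struct.unpack_from("<I", block, 16)
    lengths = struct.unpack_from("<%dI" % n_cols, block, 20)
    start = 20 + 4 * n_cols + sum(lengths[:column_number])
    raw = block[start:start + lengths[column_number]]
    return raw.decode("utf-8").split("\n")
```

Because the header records where each column starts, a reader can fetch column C without decoding columns A, B or D.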
Fig. 4 is a schematic diagram of the data storage structure according to an exemplary embodiment of the present disclosure and illustrates the inventive concept of the present disclosure. As shown in Fig. 4, in a distributed storage system the data uses a block-based column storage mode, which combines the advantages of column-oriented and row-oriented storage.
According to an embodiment of the present disclosure, the data generated in the corresponding file block can be read by row or by column. In the embodiment shown in Fig. 3, the predetermined number of rows read from the data is stored in the file block in column form. In another embodiment, the predetermined number of rows read from the data is stored in the file block in row form. In other words, because the header file makes it possible to locate data within the file block efficiently, the object of the present disclosure can be achieved whether the data in the database is stored by row or by column.
In one embodiment of the present disclosure, the generated file blocks (or data blocks) are written to the storage nodes in turn by row. A predetermined number of file blocks (or data blocks) can be stored on one storage node.
In another embodiment of the present disclosure, the generated file blocks (or data blocks) are written to the storage nodes in turn by column. In one embodiment, one storage node stores 100, 500, 1000 or more file blocks. It should be understood that the number of file blocks can be set according to the size of the storage space of the storage node. Because the file blocks are written to the storage nodes by column, compression along the column dimension can be exploited: column data in the raw data usually shares the same data attributes, so storing file blocks by column inherits this advantage, makes the data easy to compress and thus saves considerable storage space. In addition, because the file blocks are stored by column, unneeded columns can be skipped when querying, which improves read efficiency.
According to an embodiment of the present disclosure, writing the data block to a storage node comprises: compressing the data in the data block before it is written to the storage node. Storing the data after compression saves space.
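A minimal sketch of compressing before writing, one column at a time, is shown below. The text leaves the compression algorithm open, so zlib from the Python standard library is used here purely as a stand-in; the function names and the newline-joined encoding are likewise assumptions.

```python
import zlib

def compress_columns(rows):
    """Compress each column of a file block separately before it is written."""
    columns = zip(*rows)  # transpose rows into columns
    return [
        zlib.compress("\n".join(map(str, col)).encode("utf-8"))
        for col in columns
    ]

def decompress_column(blob):
    """Recover the values of one column as strings."""
    return zlib.decompress(blob).decode("utf-8").split("\n")
```

Compressing per column keeps values of the same kind together, which is the property the description relies on for a good compression ratio, and it lets a reader decompress only the columns a query needs.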
According to an embodiment of the present disclosure, writing the data block to a storage node comprises: after the size of the data block reaches a predetermined threshold, writing the data blocks to multiple storage nodes in turn. The threshold can be set freely based on the size of the database and of the storage nodes, for example to tens of megabytes or to hundreds of megabytes.
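The threshold rule can be sketched as a buffer that is flushed to the next storage node in round-robin order once it is large enough. Reading "in turn" as round-robin is an interpretation, and the function name, node names and the in-memory dictionary standing in for storage nodes are hypothetical.

```python
from itertools import cycle

def distribute_blocks(chunks, threshold_bytes, node_names):
    """Buffer incoming bytes; flush to the next node once the threshold is hit."""
    nodes = {name: [] for name in node_names}
    next_node = cycle(node_names)  # round-robin over the storage nodes
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        if len(buffer) >= threshold_bytes:
            nodes[next(next_node)].append(buffer)
            buffer = b""
    if buffer:
        nodes[next(next_node)].append(buffer)  # flush the remainder
    return nodes
```

A production system would use a threshold in the megabyte range, as the paragraph above suggests; the tiny values in a test only exercise the logic.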
According to a second aspect of the present disclosure, a data writing system corresponding to the above writing method is also provided. Fig. 5 is a schematic diagram of the data writing system according to an exemplary embodiment of the present disclosure. As shown in Fig. 5, the data writing system 100 comprises: a file block generating device 12 configured to read a predetermined number of rows of data to generate a corresponding file block; a header file generating device 16 configured to create a header file for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block; and a writing device 18 configured to write the data block to a storage node.
The data writing system according to the present disclosure achieves the same advantages as the data writing method described above. To avoid repetition, a detailed description is omitted.
In addition, the following variations are possible. According to an embodiment of the present disclosure, the data in the file block is stored in a manner that allows it to be read by row or by column. According to an embodiment of the present disclosure, the writing device is configured to write the data block to the storage node by column or by row. According to an embodiment of the present disclosure, the writing device comprises a compression unit configured to compress the data in the data block before the data block is written to the storage node. According to an embodiment of the present disclosure, the writing device is further configured to write the data blocks to multiple storage nodes in turn after the size of the data block reaches a predetermined threshold.
According to a third aspect of the present disclosure, a data reading method is also provided. Fig. 6 is a flowchart of the data reading method according to an exemplary embodiment of the present disclosure. The method comprises the following steps. In step S201, a query statement in Structured Query Language is parsed to generate a query task. In step S202, the header file is read and the position data of the files relevant to the query task is obtained from it. In step S203, data is extracted from the file block based on the position data. Here the header file is created for each file block and comprises an index of the file block itself in the database and an index of the files stored in the file block; each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
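Steps S201 to S203 can be sketched as follows. This is a simplified illustration: the regular expression handles only queries of the exact shape used earlier in the description, and the data block structure (a header mapping column names to positions, with the file block held as a list of columns) is an assumption, not the format defined by the patent.

```python
import re

QUERY_RE = re.compile(
    r"SELECT\s+(\w+)\s+FROM\s+(\w+)\s+WHERE\s+(\w+)\s*=\s*(\w+)",
    re.IGNORECASE,
)

def parse_query(sql):
    """S201: turn one simple SQL statement into a query task."""
    m = QUERY_RE.fullmatch(sql.strip())
    if m is None:
        raise ValueError("unsupported query: " + sql)
    select_col, table, where_col, where_val = m.groups()
    return {"select": select_col, "table": table,
            "where": (where_col, where_val)}

def run_query(task, data_blocks):
    """S202 and S203: look up column positions in each header, then extract."""
    where_col, where_val = task["where"]
    result = []
    for block in data_blocks:
        positions = block["header"]["columns"]                    # S202
        wanted = block["file_block"][positions[task["select"]]]   # S203
        filt = block["file_block"][positions[where_col]]
        result.extend(v for v, f in zip(wanted, filt) if str(f) == where_val)
    return result
```

Only the two columns named in the query are touched in each block; the remaining columns are never read.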
With the data reading method according to embodiments of the present disclosure, data can be queried and analyzed quickly, because the data blocks are stored in storage nodes and each file block has a header file. In addition, because all fields of the same data record unit are on the same node, this storage structure guarantees that the data of one row of the same data record unit is located on the same node, so the cost of tuple reconstruction is very low and query efficiency is ensured. Furthermore, a query reads only the rows and columns it needs according to the index file, which reduces network overhead and improves query efficiency.
According to an embodiment of the present disclosure, extracting data from the file block based on the position data comprises: reading the data stored in the file block by row or by column according to the query task.
According to a fourth aspect of the present disclosure, a data reading system is also provided. Fig. 7 is a schematic diagram of the data reading system according to an exemplary embodiment of the present disclosure. As shown in Fig. 7, the data reading system 200 comprises: a task generating device 202 configured to parse a query statement in Structured Query Language to generate a query task; a position acquisition device 204 configured to read the header file and obtain from it the position data of the files relevant to the query task; and a data extraction device 206 configured to extract data from the file block based on the position data. Here the header file is created for each file block and comprises an index of the file block itself in the database and an index of the files stored in the file block; each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
The data reading system according to the present disclosure achieves the same advantages as the data reading method described above. To avoid repetition, a detailed description is omitted.
According to an embodiment of the present disclosure, the data extraction device is configured to read the data stored in the file block by row or by column according to the query task.
Given the teaching of the above description and the associated drawings, those skilled in the art to which the present disclosure relates will recognize many modifications and other embodiments of the disclosure set out here. It is therefore to be understood that embodiments of the present disclosure are not limited to the specific embodiments disclosed, and that modifications and other embodiments are intended to fall within the scope of the present disclosure. Moreover, although the above description and the associated drawings describe example embodiments in the context of certain example combinations of components and/or functions, it should be appreciated that alternative embodiments may provide different combinations of components and/or functions without departing from the scope of the present disclosure. In this regard, for example, combinations of components and/or functions other than those explicitly described above are also contemplated as being within the scope of the present disclosure. Although specific terms are used here, they are used in a generic and descriptive sense only and are not intended to be limiting.

Claims (14)

1. A data writing method, comprising:
reading a predetermined number of rows of data to generate a corresponding file block;
creating a header file for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block; and
writing the data block to a storage node.
2. The writing method according to claim 1, wherein the data generated in the corresponding file block can be read by row or by column.
3. The writing method according to claim 1, wherein writing the data block to a storage node comprises:
writing the data block to the storage node by column or by row.
4. The writing method according to any one of claims 1-3, wherein writing the data block to a storage node comprises:
compressing the data in the data block before it is written to the storage node.
5. The writing method according to any one of claims 1-3, wherein writing the data block to a storage node comprises:
after the size of the data block reaches a predetermined threshold, writing the data blocks to multiple storage nodes in turn.
6. A data writing system, comprising:
a file block generating device configured to read a predetermined number of rows of data to generate a corresponding file block;
a header file generating device configured to create a header file for each file block, wherein the header file comprises an index of the file block itself in the database and an index of the files stored in the file block, and each file block and its corresponding header file form a data block; and
a writing device configured to write the data block to a storage node.
7. The writing system according to claim 6, wherein the data generated in the corresponding file block is stored in a manner that allows it to be read by row or by column.
8. The writing system according to claim 6, wherein the writing device is configured to write the data block to the storage node by column or by row.
9. The writing system according to any one of claims 6-8, wherein the writing device comprises a compression unit configured to compress the data in the data block before the data block is written to the storage node.
10. The writing system according to any one of claims 6-8, wherein the writing device is further configured to write the data blocks to multiple storage nodes in turn after the size of the data block reaches a predetermined threshold.
11. A data reading method, comprising:
Parsing a query statement of a structured query language to generate a query task;
Reading a header file, and obtaining from the header file the position data of the files relevant to the query task; and
Extracting data from the file block based on the position data;
Wherein the header file is created based on each file block, the header file contains the index of each file block itself in the database and the index of the files stored in each file block, each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
12. The data reading method according to claim 11, wherein extracting data from the file block based on the position data comprises:
Reading the data stored in the file block by row or by column according to the query task.
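The three steps of the reading method (parse the query, look up positions in the header, extract from the file block) can be sketched as below. The parser is a toy that handles only `SELECT col[, col...] FROM table`; the dictionary layout of the data block, the `positions` field holding (offset, length) pairs, and the comma-separated encoding are all assumptions made for illustration.

```python
import re


def parse_query(sql):
    """Toy parser: turns 'SELECT a, b FROM t' into a query task dictionary."""
    match = re.fullmatch(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*;?\s*", sql, re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unsupported query: {sql!r}")
    columns = [c.strip() for c in match.group(1).split(",")]
    return {"table": match.group(2), "columns": columns}


def read_columns(data_block, task):
    """Use the header's position data to slice only the requested columns."""
    header, payload = data_block["header"], data_block["payload"]
    result = {}
    for name in task["columns"]:
        offset, length = header["positions"][name]   # position data from the header
        raw = payload[offset:offset + length]         # touch only the needed bytes
        result[name] = raw.decode("utf-8").split(",")
    return result


# Hypothetical data block: two columns stored back to back in one payload.
data_block = {
    "header": {"block_index": 0, "positions": {"id": (0, 5), "tag": (5, 5)}},
    "payload": b"1,2,3" + b"a,b,c",
}
task = parse_query("SELECT tag FROM trips")
result = read_columns(data_block, task)
```

Because the positions come from the header, a query that names one column never has to decode the bytes of the others.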
13. A data reading system, comprising:
A task generating device, configured to parse a query statement of a structured query language to generate a query task;
A position acquisition device, configured to read a header file and obtain from the header file the position data of the files relevant to the query task; and
A data extraction device, configured to extract data from the file block based on the position data;
Wherein the header file is created based on each file block, the header file contains the index of each file block itself in the database and the index of the files stored in each file block, each file block and its corresponding header file form a data block, and the data block is stored in a storage node.
14. The data reading system according to claim 13, wherein the data extraction device is configured to read the data stored in the file block by row or by column according to the query task.
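Reading the same file block by row or by column can be sketched as follows, assuming the block keeps each column as a separate list. The function names and the sample data are hypothetical; the claims do not specify how the two access modes are implemented.

```python
def read_by_column(file_block, column_names):
    """Return the requested columns as-is: cheap when a query needs few columns."""
    return {name: list(file_block[name]) for name in column_names}


def read_by_row(file_block, column_names):
    """Reassemble rows by zipping the requested columns together."""
    return list(zip(*(file_block[name] for name in column_names)))


# Hypothetical column-wise file block.
file_block = {"id": [1, 2, 3], "fare": [10.5, 7.0, 22.3]}

by_column = read_by_column(file_block, ["fare"])
by_row = read_by_row(file_block, ["id", "fare"])
```

An aggregate over one column would use the column path, while a query that returns whole records would use the row path.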
CN201510131621.8A 2015-03-24 2015-03-24 Data processing method and system Pending CN104699815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510131621.8A CN104699815A (en) 2015-03-24 2015-03-24 Data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510131621.8A CN104699815A (en) 2015-03-24 2015-03-24 Data processing method and system

Publications (1)

Publication Number Publication Date
CN104699815A true CN104699815A (en) 2015-06-10

Family

ID=53346935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510131621.8A Pending CN104699815A (en) 2015-03-24 2015-03-24 Data processing method and system

Country Status (1)

Country Link
CN (1) CN104699815A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101178693A (en) * 2007-12-14 2008-05-14 沈阳东软软件股份有限公司 Data cache method and system
US20110276781A1 (en) * 2010-05-05 2011-11-10 Microsoft Corporation Fast and Low-RAM-Footprint Indexing for Data Deduplication
CN102375853A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building index therein and query method
CN102332030A (en) * 2011-10-17 2012-01-25 中国科学院计算技术研究所 Data storing, managing and inquiring method and system for distributed key-value storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YONGQIANG HE et al.: "RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems", 《2011 IEEE 27TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038188A (en) * 2017-12-11 2018-05-15 中国银行股份有限公司 Document handling method and device
CN110008179A (en) * 2019-04-02 2019-07-12 深圳创维汽车智能有限公司 File storage method, automobile data recorder and readable storage medium
CN110008179B (en) * 2019-04-02 2023-06-16 深圳创维汽车智能有限公司 File storage method, automobile data recorder and readable storage medium
CN113094374A (en) * 2021-04-27 2021-07-09 广州炒米信息科技有限公司 Distributed storage and retrieval method and device and computer equipment
CN113468107A (en) * 2021-09-02 2021-10-01 阿里云计算有限公司 Data processing method, device, storage medium and system
CN117931095A (en) * 2024-03-21 2024-04-26 腾讯科技(深圳)有限公司 Map data storage method, apparatus, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US10331641B2 (en) Hash database configuration method and apparatus
CN110674154B (en) Spark-based method for inserting, updating and deleting data in Hive
CN102375853A (en) Distributed database system, method for building index therein and query method
CN102402602A (en) B+ tree indexing method and device of real-time database
CN103914483B (en) File storage method and device, and file reading method and device
CN104699815A (en) Data processing method and system
CN104239377A (en) Platform-crossing data retrieval method and device
CN103488687A (en) Searching system and searching method of big data
CN103902544A (en) Data processing method and system
CN103473276A (en) Storage method of very large data and distributed database system and retrieval method thereof
CN111258978A (en) Data storage method
CN104572862A (en) Mass data storage access method and system
CN104408067A (en) Multi-tree structure database design method and device
CN105022791A (en) Novel KV distributed data storage method
Shangguan et al. Big spatial data processing with Apache Spark
CN107741947B (en) Method for storing and acquiring random number key based on HDFS file system
CN105787090A (en) Index building method and system of OLAP system of electric data
CN102360359A (en) Data management device and data management method
CN104346347A (en) Data storage method, device, server and system
CN116756253B (en) Data storage and query methods, devices, equipment and media of relational database
KR101955376B1 (en) Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method
CN104809249A (en) Processing method and system of data structure
CN105005621A (en) Method for constructing distributed storage and parallel indexing system for big data
CN102955808A (en) Data acquisition method and distributed file system
KR101530441B1 (en) Method and apparatus for processing data based on column

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150610
