CN103218365A - SS Table file data processing method and system - Google Patents


Publication number
CN103218365A
Authority
CN
China
Prior art keywords
data
column
sstable file
row
line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012100185032A
Other languages
Chinese (zh)
Inventor
庄明强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN2012100185032A
Publication of CN103218365A

Abstract

The invention discloses an SSTable file data processing method and system. A schema table is provided in the SSTable file, and the schema table defines the column order and column attribute information of the row data of the SSTable file. The method comprises: reading row data to be written to the SSTable file; writing a row key to the SSTable file; and, for that row key, writing each column value of the read row data according to the column order and column attribute information defined in the schema table of the SSTable file. When the method and system are used to process an SSTable file, data is read and written according to the schema table, so a row is stored as only its row key and column values. The column values are stored in the order prescribed by the schema table, and information such as column names or column IDs does not need to be stored, which reduces the volume of stored data.

Description

An SSTable file data processing method and system
Technical field
The application relates to the field of communication technology, and in particular to an SSTable file data processing method and system.
Background
An SSTable (Sorted String Table) is essentially a file format used to store ordered Key-Value data on disk. Each SSTable file consists of multiple blocks; once it has been written, it cannot be modified and can only be read.
At present, distributed databases commonly use the SSTable format to store ordered Key-Value data, and the Key-Value data is all stored as strings. Each SSTable contains many rows, and a certain number of rows make up a block. Each row comprises a row key (RowKey) and row data. The row data generally comprises several columns; for each column, the column name is the Key and the column value is the Value. Fig. 1 shows a row data storage format of an SSTable file.
SSTable files generally support only a sparse storage format. When row data is written to an SSTable file, the row key is written first, followed by the columns in that row whose values are not empty; columns whose values are empty are not written and take up no storage space. With this sparse storage, the number of columns in each row is not fixed, and the same column names are stored repeatedly in different rows, which wastes storage space, particularly in applications where the number of columns per row is fixed and most columns in each row are not empty.
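The space argument above can be made concrete with a small Python sketch. It is an illustration only: the column names, the sample row, and the one-byte placeholder for an empty column are assumptions, and length prefixes and other framing bytes are ignored on both sides.

```python
def sparse_size(row):
    """Bytes used when every non-empty column stores its name and its value as strings."""
    return sum(
        len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
        for name, value in row.items()
        if value is not None
    )


def dense_size(row, column_order):
    """Bytes used when only values are stored, in a fixed column order.

    An empty column is assumed to cost one placeholder byte.
    """
    return sum(
        1 if row.get(name) is None else len(str(row[name]).encode("utf-8"))
        for name in column_order
    )


# Hypothetical table with four columns, one of them empty in this row.
columns = ["user_id", "nickname", "city", "modified_time"]
row = {"user_id": 1001, "nickname": "zhang", "city": None, "modified_time": 1326700000}

print(sparse_size(row), dense_size(row, columns))
```

In this sample the repeated column names account for more than half of the sparse row, which is the overhead the application sets out to remove.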
Summary of the invention
The embodiments of the application provide a data processing method and system based on the SSTable file format, to solve the problem that the existing SSTable file data storage mechanism wastes storage space.
In an SSTable file data processing method provided by an embodiment of the application, a schema table is provided in the SSTable file, and the schema table defines the column order and column attribute information of the row data of the SSTable file. The method comprises:
reading row data to be written to the SSTable file;
writing a row key to the SSTable file and, for that row key, writing each column value of the read row data according to the column order and column attribute information of the row data defined in the schema table of the SSTable file.
In another SSTable file data processing method provided by an embodiment of the application, a schema table is provided in the SSTable file, and the schema table defines the column order and column attribute information of the row data of the SSTable file. The method comprises:
receiving a data query request for the SSTable file;
querying the requested column data according to the schema table in the SSTable file;
returning a query response that carries the queried column data.
In an SSTable file data processing system provided by an embodiment of the application, a schema table is provided in the SSTable file, and the schema table defines the column order and column attribute information of the row data of the SSTable file. The system comprises:
a reading module, configured to read row data to be written to the SSTable file;
a storage module, configured to write a row key to the SSTable file and, for that row key, to write each column value of the read row data according to the column order and column attribute information of the row data defined in the schema table of the SSTable file.
In another SSTable file data processing system provided by an embodiment of the application, a schema table is provided in the SSTable file, and the schema table defines the column order and column attribute information of the row data of the SSTable file. The system comprises:
a receiving module, configured to receive a data query request for the SSTable file;
a query module, configured to query the requested column data according to the schema table in the SSTable file;
a return module, configured to return a query response that carries the queried column data.
In the above embodiments of the application, because a schema table is provided in the SSTable file and defines the column order and column attribute information of the row data of the SSTable file, data is read and written according to this schema table when the SSTable file is processed. As a result, storing a row requires only the row key and the column values; the column values are stored in the order specified in the schema table, and information such as column names or column IDs need not be stored, which reduces the amount of stored data.
Description of drawings
Fig. 1 is a schematic diagram of the row data storage format of an SSTable file in the prior art;
Fig. 2 is a schematic diagram of the SSTable file data storage format provided by an embodiment of the application;
Fig. 3 is a schematic diagram of the storage format of the schema table provided by an embodiment of the application;
Fig. 4 is a schematic flowchart of writing data to an SSTable file provided by an embodiment of the application;
Fig. 5 is a schematic flowchart of reading data from an SSTable file provided by an embodiment of the application;
Fig. 6 is the first structural diagram of the SSTable file data processing system provided by an embodiment of the application;
Fig. 7 is the second structural diagram of the SSTable file data processing system provided by an embodiment of the application;
Fig. 8 is the third structural diagram of the SSTable file data processing system provided by an embodiment of the application.
Embodiments
Existing SSTable files support only sparse-format storage, but in many cases business data is in fact dense structured data that calls for dense-format storage. Dense-format storage means that, when row data is stored, columns that are not empty store their column value and columns that are empty are filled with a null object, so that every column in every row stores a value. For example, for business data migrated from a MySQL or Oracle database, the number of columns in each table is fixed, the columns change infrequently, most columns in the table are not empty, and the business often reads and writes whole rows. For such applications, the sparse storage of existing SSTable files wastes considerable space and gives poor read/write efficiency.
To solve the above problem, the embodiments of the application improve the storage format used by SSTable files and correspondingly improve the data storage flow and the data query flow.
Specifically, a schema table is added to the SSTable file storage format of the embodiments to describe the columns contained in the row data; that is, the schema table defines the order of the columns in the row data of the SSTable file and further defines column attribute information. The schema table can define information about the data tables of the SSTable file and about the columns contained in the row data of those tables. For example, the schema table defines the table name, the table ID, the columns contained in the rows of the table, the order of the columns, and the column name, column ID, and column value data type of each column. The column order can be identified by sequence numbers, or expressed by the order in which the information about each column is arranged in the schema table.
Fig. 2 is a schematic diagram of an SSTable file storage format that supports dense storage, provided by an embodiment of the application.
The SSTable file uses blocks to hold ordered Key-Value data. The size of each block is configurable and is generally between 8 KB and 64 KB.
The whole SSTable file stores Key-Value data ordered by Key, where the Key is the row key and the Value is the row data. Row data is stored in a dense format: row key + column value array. The row key consists of the row key length followed by the row key string. The column value array holds the values of all columns of the row; their number, order, and data types are defined by the schema table of the SSTable file. The array includes null objects but does not include column attribute information such as column names or column IDs. An empty column value is filled with a null object, which ensures that the column order and column data in every row are consistent with the definitions in the schema table.
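The dense row layout described above (row key length, row key string, then one value per schema column) can be sketched in Python as follows. The byte-level encoding is an assumption made for illustration: a 2-byte little-endian key length, a 1-byte presence flag standing in for the null object, 8-byte integers, and length-prefixed UTF-8 strings. The application does not specify these details, and the names `encode_row` and `decode_row` are hypothetical.

```python
import struct


def encode_row(row_key, values, column_types):
    """Encode one dense row: key length + key + one value per schema column.

    column_types lists "int" or "str" in schema order. No column names or
    column IDs are written; the reader recovers them from the schema.
    """
    out = bytearray(struct.pack("<H", len(row_key)) + row_key)
    for value, col_type in zip(values, column_types):
        if value is None:
            out.append(0)  # assumed null-object marker
        elif col_type == "int":
            out.append(1)
            out += struct.pack("<q", value)
        else:  # "str"
            data = value.encode("utf-8")
            out.append(1)
            out += struct.pack("<H", len(data)) + data
    return bytes(out)


def decode_row(buf, column_types):
    """Decode a row produced by encode_row, using the schema's column types."""
    (key_len,) = struct.unpack_from("<H", buf, 0)
    pos = 2 + key_len
    row_key = bytes(buf[2:pos])
    values = []
    for col_type in column_types:
        present = buf[pos]
        pos += 1
        if not present:
            values.append(None)
        elif col_type == "int":
            values.append(struct.unpack_from("<q", buf, pos)[0])
            pos += 8
        else:
            (n,) = struct.unpack_from("<H", buf, pos)
            pos += 2
            values.append(bytes(buf[pos:pos + n]).decode("utf-8"))
            pos += n
    return row_key, values
```

Because the decoder takes its types and column count from the schema, the encoded row carries no per-column metadata beyond the presence flag.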
The schema table header holds the total number of columns in the tables. The schema table mainly defines the column order of each row and some attributes of the columns, such as the table ID, the column ID, and the data type of the column value (for example integer or string). The column data stored in each row is consistent with the column order and the other column attributes defined in the schema table.
A reference implementation of the storage format of the schema table is shown in Fig. 3. The schema table header (Header) records the total number of columns in the tables. The schema table stores information for multiple tables. Each Column Definition defines the table ID to which the column belongs, the column ID, and the column value type; column value types include number, string, time, and so on. The schema entries of all tables are stored together, and the whole Column Definition array is stored in (table ID, column ID) order.
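As a rough in-memory model of the schema table just described, the Python sketch below keeps a Column Definition array sorted by (table ID, column ID) and exposes the header's column count. The class and field names (`SchemaTable`, `ColumnDefinition`, `ValueType`) are hypothetical, and the on-disk encoding of Fig. 3 is not reproduced.

```python
from dataclasses import dataclass
from enum import Enum


class ValueType(Enum):
    NUMBER = 1
    STRING = 2
    TIME = 3


@dataclass(frozen=True)
class ColumnDefinition:
    table_id: int
    column_id: int
    value_type: ValueType


class SchemaTable:
    """Column definitions for all tables, kept in (table ID, column ID) order."""

    def __init__(self, definitions):
        self.columns = sorted(definitions, key=lambda d: (d.table_id, d.column_id))

    @property
    def column_count(self):
        # Corresponds to the total column count recorded in the header.
        return len(self.columns)

    def columns_of(self, table_id):
        # Columns of one table, in the order their values appear in each row.
        return [d for d in self.columns if d.table_id == table_id]


schema = SchemaTable([
    ColumnDefinition(2, 1, ValueType.STRING),
    ColumnDefinition(1, 2, ValueType.TIME),
    ColumnDefinition(1, 1, ValueType.NUMBER),
])
```

Here the column order is expressed by the arrangement of the definitions, which is one of the two options the description allows; explicit sequence numbers would be the other.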
Preferably, a row index can also be built in each block to record the start position of every row in the block. The row index can be an array of (number of rows + 1) elements, each an int (integer) value recording the offset of the start of a row relative to the start of its block. The last element points to the end of the last row. Because the length of each row is not stored, a row's length is obtained by subtracting the start positions of adjacent rows; this design saves storage space. Of course, the form of the row index is not limited to this: any index data format that can identify the position of row data within its block and locate the row data falls within the protection scope of the application.
Because a row index is built in the block, a binary search can be used during data retrieval to quickly locate a given Key inside the block. Existing SSTable files do not support a row index inside the block, so data retrieval requires traversing the block. The embodiments of the application can therefore improve data retrieval efficiency through the row index.
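The row index and the binary search it enables can be sketched in Python as below. The block layout used here (a 2-byte little-endian key length, the key, then an opaque payload) and the function names are assumptions for illustration; only the (row count + 1) offset array and the subtraction of adjacent offsets come from the description.

```python
import struct


def build_block(rows):
    """Build a block body and its row index from (key, payload) pairs sorted by key."""
    body = bytearray()
    offsets = []
    for key, payload in rows:
        offsets.append(len(body))  # start of this row relative to the block start
        body += struct.pack("<H", len(key)) + key + payload
    offsets.append(len(body))  # last element points to the end of the last row
    return bytes(body), offsets


def key_at(body, offsets, i):
    """Read the row key of row i using its indexed start position."""
    start = offsets[i]
    (key_len,) = struct.unpack_from("<H", body, start)
    return body[start + 2:start + 2 + key_len]


def find_row(body, offsets, key):
    """Binary-search the block for key; return the raw row bytes or None."""
    lo, hi = 0, len(offsets) - 2
    while lo <= hi:
        mid = (lo + hi) // 2
        current = key_at(body, offsets, mid)
        if current == key:
            # Row length is the difference between adjacent start positions.
            return body[offsets[mid]:offsets[mid + 1]]
        if current < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

Storing one extra offset instead of a length per row is what lets the index double as both a locator and a length table.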
The block header stores the number of rows in the block; where a row index is built in the block, it can also store information such as the position of the row index.
As the SSTable file storage format provided by the embodiments shows, storing a row requires only the row key and the column value array; column values are stored in the order specified in the schema table, and column names or column IDs need not be stored, which reduces the amount of stored data.
Based on the SSTable file storage format provided by the embodiments, data is written to the SSTable file in units of whole rows. For each row, the row key is written first; then, for that row key, each column value of the row data is written according to the column order and column attribute information defined in the schema table of the SSTable file. The written column data does not include column attribute information (such as column names or column IDs). Each row is stored in the dense format, row key + column value array, and empty column values are filled with null objects.
Specifically, when a row is written, after the row key is written, each column value of the read row data is written for that row key in the column order defined in the schema table of the SSTable file, so the order of the column values in the written row data matches the column order defined in the schema table. If the column attribute of a column value to be written is inconsistent with the column attribute of the corresponding column defined in the schema table, the write of the current row fails.
Further, after a row has been written, a row index entry is created for the block in which it resides. The row index identifies the position of the row data in the block; for example, the entry is the start position of the row data in the block.
Based on the above data storage principle, Fig. 4 shows a flow for writing data to an SSTable file provided by an embodiment of the application. This flow is only one specific implementation of the embodiment; any flow that embodies the above principle falls within the protection scope of the application.
As shown in Fig. 4, after the row data to be written to the SSTable file has been read, the flow can comprise:
Step 401: determine whether all row data has been written; if so, go to step 406; otherwise, go to step 402;
Step 402: write the row key, then go to step 403;
Step 403: determine whether the current row has been completely written; if so, go to step 401; otherwise, go to step 404;
Step 404: check the current column of the current row against the columns defined in the schema table of the SSTable file, that is, check whether the current column is consistent with the schema table definition; if the check passes, go to step 405; otherwise, the write of the current row fails and the flow ends;
The check passes if the attributes of the current column are consistent with the column attributes defined in the schema table. For example, the current column passes the check when its data type is consistent with the schema table definition; as another example, it passes when its position in the row data to be written matches the column order defined in the schema table. In this way, data to be written can be strictly validated so that erroneous data is not written to the SSTable file.
Step 405: write the current column value to the SSTable file, then go to step 403;
Step 406: write the metadata to the SSTable file, then end the write flow.
The definition of the metadata in the embodiments can be consistent with the definition in the prior art.
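Steps 401 to 406 can be traced with the following Python sketch. It is a simplified model under stated assumptions: the SSTable file is represented by a plain list of records, the schema is reduced to a list of Python types in column order, the metadata is a hypothetical row count, and `RowWriteError` and `write_rows` are illustrative names rather than anything defined in the application.

```python
class RowWriteError(Exception):
    """Raised when a column value does not match the schema definition."""


def write_rows(rows, column_types, out):
    """Append rows to out, validating each column against column_types.

    rows is a list of (row_key, values); column_types holds one Python type
    per schema column, in schema order. None stands for the null object.
    """
    for row_key, values in rows:                      # step 401: more rows to write?
        out.append(("key", row_key))                  # step 402: write the row key
        if len(values) != len(column_types):          # step 404: column positions must match
            raise RowWriteError(f"row {row_key!r}: wrong number of columns")
        for position, (value, expected) in enumerate(zip(values, column_types)):  # step 403
            if value is not None and not isinstance(value, expected):             # step 404
                raise RowWriteError(
                    f"row {row_key!r}, column {position}: expected {expected.__name__}"
                )
            out.append(("value", value))              # step 405: write the column value
    out.append(("meta", {"row_count": len(rows)}))    # step 406: write metadata, then finish


records = []
write_rows([(b"k1", [1, "a"]), (b"k2", [2, None])], [int, str], records)
```

As in the flow of Fig. 4, a failed check stops the write after the row key has already been emitted, so a real writer would need to decide how to discard the partial row.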
As the write flow provided by the embodiments shows, when data is written to an SSTable file, Key-Value data can be written in its original data type, and the data returned also has its own data type. No "string -> number" or "number -> string" conversion is needed, which is convenient for the business, shortens operation time, and improves write efficiency. Strict type checking against the schema table can also be performed during writing, to avoid writing erroneous data to the SSTable and improve the accuracy of writes.
Based on the SSTable file storage format provided by the embodiments, to query required column data from an SSTable, the schema table is first read from the SSTable, and the requested column data is then queried according to the schema table.
Specifically, when the requested column data is queried according to the schema table, the whole row can be read first, and the column data to be queried is then filtered out of the whole row according to the schema table and the requested column information.
Further, when the requested column data is queried according to the schema table, after the column data to be queried has been filtered out, it is determined whether the column data is consistent with the attributes (such as the data type of the column value) of the corresponding column defined in the schema table. If so, the queried column data is used as the query result; otherwise, the query fails and stops. Where a queried column does not exist in the schema table, the case can be handled according to application requirements.
Further, after the requested column data has been queried, statistics or analysis can be performed on it according to the column attributes (such as the data type of the column value) defined in the schema table. Because the schema table contains the data type of each column, data can easily be filtered and analyzed by data type during a query, for example by returning only data whose modification time is later than a certain point in time.
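The modification-time example can be sketched in Python as follows. The sketch assumes that rows have already been decoded into value lists in schema order and that the time column holds `datetime` values; the function name and column names are hypothetical.

```python
from datetime import datetime


def rows_modified_after(rows, column_names, time_column, cutoff):
    """Return the rows whose time column is later than cutoff.

    Because the schema fixes the position and type of the time column,
    values can be compared directly without parsing strings.
    """
    index = column_names.index(time_column)
    return [row for row in rows if row[index] is not None and row[index] > cutoff]


column_names = ["title", "modified_time"]
rows = [
    ["a", datetime(2012, 1, 10)],
    ["b", None],
    ["c", datetime(2012, 2, 1)],
]
recent = rows_modified_after(rows, column_names, "modified_time", datetime(2012, 1, 15))
```

Rows whose time column holds the null object are skipped here; an application could equally choose to treat them as matching.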
Based on the above data query principle, Fig. 5 shows a flow for querying data from an SSTable file provided by an embodiment of the application. This flow is only one specific implementation of the embodiment; any flow that embodies the above principle falls within the protection scope of the application.
As shown in Fig. 5, after a request to query data in the SSTable file is received, the flow can comprise:
Step 501: read the schema table in the SSTable file;
Step 502: check the requested columns against the schema table that was read, that is, determine whether the requested columns are consistent with the schema table; if the check passes, go to step 503; otherwise, end the query flow;
Step 503: determine whether all requested row data has been read; if so, end the query flow; otherwise, go to step 504;
Step 504: read in one row of data and traverse its columns; when the row data is read in, the row index can be used to locate it quickly;
Step 505: determine whether all column data has been traversed; if so, go to step 503; otherwise, go to step 506;
Step 506: traverse to the next column;
Step 507: determine from the query request whether the currently traversed column data needs to be returned; if so, go to step 508; otherwise, go to step 505. The column data needs to be returned if the currently traversed column is a requested column.
Step 508: write the requested column data to the output queue, ready to be output as the query result, then go to step 505.
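The query flow of steps 501 to 508 can be sketched in Python as below. The sketch assumes the schema has already been read and reduced to a list of column names in storage order, that rows are available as decoded value lists, and that an unknown requested column simply ends the query with an error; `query_columns` is a hypothetical name.

```python
def query_columns(schema_columns, rows, requested):
    """Return (row_key, {column: value}) for the requested columns of every row.

    schema_columns lists column names in storage order (step 501 is assumed
    to have produced it); rows is a list of (row_key, values).
    """
    unknown = [name for name in requested if name not in schema_columns]
    if unknown:                                            # step 502: check request against schema
        raise KeyError(f"columns not in schema: {unknown}")
    wanted = set(requested)
    output = []
    for row_key, values in rows:                           # steps 503 and 504: read each row
        selected = {}
        for name, value in zip(schema_columns, values):    # steps 505 and 506: traverse columns
            if name in wanted:                             # step 507: is this column requested?
                selected[name] = value                     # step 508: queue it for output
        output.append((row_key, selected))
    return output


schema_columns = ["user_id", "nickname", "city"]
rows = [(b"k1", [1, "zhang", None]), (b"k2", [2, "li", "Hangzhou"])]
result = query_columns(schema_columns, rows, ["user_id", "city"])
```

Column names are resolved purely by position against the schema, which is why the stored rows need no names or IDs of their own.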
As the query flow provided by the embodiments shows, when a whole row is read, the column value array is returned directly; even when only some columns in a row are read, the whole row can be read and then filtered to obtain the requested column data. Because the total amount of data stored in the dense format is smaller than that stored in the sparse format, this remains efficient even with filtering. In addition, the SSTable query flow performs strict validation against the schema table, which ensures that the column data is consistent with the schema table.
Based on the same technical concept, the embodiments of the application also provide an SSTable file data processing system.
Fig. 6 is a schematic structural diagram of an SSTable file data processing system provided by an embodiment of the application. A schema table is provided in the SSTable file and defines the column order and column attribute information of the row data of the SSTable file. The system comprises:
a reading module 601, configured to read row data to be written to the SSTable file;
a storage module 602, configured to write a row key to the SSTable file and, for that row key, to write each column value of the read row data according to the column order and column attribute information of the row data defined in the schema table of the SSTable file.
Specifically, when storing row data, the storage module 602 writes, for the row key, each column value of the read row data in the column order of the row data defined in the schema table of the SSTable file. If the column attribute of a column value to be written is inconsistent with the column attribute of the corresponding column defined in the schema table, the write of the current row fails.
Specifically, in the process of writing each column value of the read row data, the storage module 602 writes a null object for a column that is empty.
Further, after writing the row data, the storage module 602 creates a row index for the row data in the block where the row data resides; the row index identifies the position of the row data in that block.
Fig. 7 is a schematic structural diagram of an SSTable file data processing system provided by another embodiment of the application. A schema table is provided in the SSTable file and defines the column order and column attribute information of the row data of the SSTable file. The system comprises:
a receiving module 701, configured to receive a data query request for the SSTable file;
a query module 702, configured to query the requested column data according to the schema table in the SSTable file;
a return module 703, configured to return a query response that carries the queried column data.
Specifically, in the process of querying data, the query module 702 reads the data of the corresponding columns in the target row data according to the schema table in the SSTable file and the requested column information, and determines whether the attributes of the column data that was read are consistent with the attributes of the corresponding columns defined in the schema table. If so, the queried column data is used as the query result; otherwise, the query fails.
Further, the system can also comprise an analysis and processing module 704, configured to perform statistics or analysis on the queried column data, according to the column attributes defined in the schema table in the SSTable file, after the query module 702 has queried the requested column data.
Specifically, the SSTable file also contains row index information, and the row index identifies the position of row data within its block. Accordingly, before querying the requested column data according to the schema table in the SSTable file, the query module 702 can determine the position of the target row data according to the row index information and read the corresponding row data from the determined position.
As shown in Fig. 8, in a specific implementation, an SSTable file data processing system usually comprises the functional modules shown in both Fig. 6 and Fig. 7; that is, it can usually both store data and query data.
In summary, the SSTable storage format of the embodiments contains a schema table similar to that of a relational database. The schema table defines the table name, the table ID, the columns contained in the rows of the table, the order of the columns, and the column name, column ID, and column value type of each column. Storing a row requires only the row key and the column value array; column values are stored in the order specified in the schema table, and column names or column IDs need not be stored, which reduces the amount of stored data. Data is written in units of whole rows, and strict validation can be performed against information defined in the schema table, such as the column order, to avoid writing erroneous data to the SSTable. When a whole row is read, the column value array is returned directly; even when only some columns in a row are read, the whole row can be read and then filtered. Because the total amount of data stored in the dense format is smaller than that stored in the sparse format, this remains efficient even with filtering. Support for reading and writing a dense storage format makes SSTable a simple and efficient solution for applications based on whole-row reads and writes.
It should be appreciated that the modules of the system of the application can be integrated into one unit or deployed separately. The above modules can be merged into one module or further split into multiple submodules.
Through the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the application in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the application.
Those skilled in the art will appreciate that the accompanying drawings are schematic diagrams of a preferred embodiment, and that the modules or flows in the drawings are not necessarily required to implement the application.
The above is only a preferred implementation of the application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the application, and these improvements and modifications should also be regarded as falling within the protection scope of the application.

Claims (10)

1. a sequencing character string table SSTable file data processing method is characterized in that, is provided with the framework table in the SSTable file, and wherein definition has the row order and the Column Properties information of described SSTable file line data, and this method comprises:
Read the line data that is written to the SSTable file;
Writing line major key in the SSTable file, and the row of the line data that defines in the framework table according to described SSTable file order and Column Properties information, corresponding described capable major key writes each column data of the line data that reads.
2. the method for claim 1 is characterized in that, the row order and the Column Properties information of the line data that defines in the described framework table according to the SSTable file, and corresponding described capable major key writes each column data of the line data that reads, comprising:
Corresponding described capable major key, the row of the described SSTable file line data that define in the framework table according to the SSTable file write each column data of the line data that reads in proper order; Wherein, if the Column Properties of the respective column that defines in the Column Properties of column data to be written and the described framework table is inconsistent, then the current line data write failure.
3. the method for claim 1 is characterized in that, in the process of each column data that writes the line data that reads, for being empty row, writes the null value object.
4. the method for claim 1 is characterized in that, also comprises after the writing line data:
At the piece at described line data place, create the line index of described line data, described line index is used for identifying the position of described line data at this piece.
5. A method for processing data of a Sorted String Table (SSTable) file, characterized in that a framework table is provided in the SSTable file, the framework table defining the column order and column attribute information of the row data of the SSTable file, the method comprising:
Receiving a data query request for the SSTable file;
Querying the requested column data according to the framework table in the SSTable file;
Returning a query response that carries the column data found.
6. The method of claim 5, characterized in that querying the requested column data according to the framework table in the SSTable file comprises:
Reading the data of the corresponding columns from the target row data, according to the framework table in the SSTable file and the column information requested by the query;
Judging whether the attribute of each column value read is consistent with the attribute of the corresponding column defined in the framework table; if consistent, taking the column data found as the query result; otherwise, the query fails.
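The query step above can be sketched in Python as a lookup by column position followed by an attribute check. The `SCHEMA` list, the `QueryError` exception, the `query_columns` helper, and the use of `None` to stand in for the null-value object are assumptions introduced for illustration.

```python
# Hypothetical framework table: ordered (column name, expected Python type).
SCHEMA = [("user_id", int), ("name", str), ("score", float)]


class QueryError(Exception):
    """Raised when a query cannot be answered from the row."""


def query_columns(row_values, requested, schema=SCHEMA):
    """Return {column: value} for the requested columns of one decoded row."""
    # The framework table maps each column name to its position in the row.
    position = {name: i for i, (name, _) in enumerate(schema)}
    result = {}
    for name in requested:
        if name not in position:
            raise QueryError(f"unknown column: {name!r}")
        value = row_values[position[name]]
        expected = schema[position[name]][1]
        # None stands in for the null-value object here (an assumption).
        if value is not None and type(value) is not expected:
            raise QueryError(
                f"column {name!r} expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        result[name] = value
    return result
```

For example, `query_columns([7, "alice", 9.5], ["score", "name"])` returns only the two requested columns, while a row whose `name` slot holds an integer causes the query to fail.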
7. The method of claim 5, characterized in that, after the requested column data is found, the method further comprises: performing statistics or analysis on the column data found, according to the column attributes defined in the framework table of the SSTable file.
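To illustrate how a column attribute can drive the statistics step above, the sketch below picks an aggregation based on the attribute declared in the framework table. The attribute names, the `summarize` helper, and the particular statistics chosen are assumptions; the claim does not say which statistics or analyses are performed.

```python
from statistics import mean

# Hypothetical framework table: ordered (column name, attribute) pairs.
SCHEMA = [("user_id", "int"), ("name", "str"), ("score", "float")]
NUMERIC = {"int", "float"}


def summarize(rows, column, schema=SCHEMA):
    """Aggregate one column across rows, choosing statistics by attribute."""
    names = [name for name, _ in schema]
    pos = names.index(column)
    attr = schema[pos][1]
    # Skip null values (represented here as None, an assumption).
    values = [row[pos] for row in rows if row[pos] is not None]
    if attr in NUMERIC:
        return {
            "count": len(values),
            "sum": sum(values),
            "mean": mean(values) if values else None,
        }
    # Non-numeric columns: only counting makes sense.
    return {"count": len(values), "distinct": len(set(values))}


rows = [[1, "a", 2.0], [2, "b", 4.0], [3, "a", None]]
score_stats = summarize(rows, "score")
name_stats = summarize(rows, "name")
```

Because the attribute is known from the framework table, the reader can decide up front that `score` is summable and `name` is not, without inspecting the stored values.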
8. The method of any one of claims 5 to 7, characterized in that the SSTable file further comprises row index information, the row index identifying the position of row data within the block where it resides;
Before the requested column data is queried according to the framework table in the SSTable file, the method further comprises: determining the position of the target row data according to the row index information, and reading the corresponding row data from the determined position.
9. A system for processing data of a Sorted String Table (SSTable) file, characterized in that a framework table is provided in the SSTable file, the framework table defining the column order and column attribute information of the row data of the SSTable file, the system comprising:
A reading module, configured to read row data to be written to the SSTable file;
A storage module, configured to write a row primary key into the SSTable file and, in accordance with the column order and column attribute information of the row data defined in the framework table of the SSTable file, to write each column value of the read row data in correspondence with the row primary key.
10. A system for processing data of a Sorted String Table (SSTable) file, characterized in that a framework table is provided in the SSTable file, the framework table defining the column order and column attribute information of the row data of the SSTable file, the system comprising:
A receiving module, configured to receive a data query request for the SSTable file;
A query module, configured to query the requested column data according to the framework table in the SSTable file;
A returning module, configured to return a query response that carries the column data found.
CN2012100185032A 2012-01-20 2012-01-20 SS Table file data processing method and system Pending CN103218365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012100185032A CN103218365A (en) 2012-01-20 2012-01-20 SS Table file data processing method and system

Publications (1)

Publication Number Publication Date
CN103218365A true CN103218365A (en) 2013-07-24

Family

ID=48816168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012100185032A Pending CN103218365A (en) 2012-01-20 2012-01-20 SS Table file data processing method and system

Country Status (1)

Country Link
CN (1) CN103218365A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007098055A2 (en) * 2006-02-17 2007-08-30 Google Inc. Encoding and adaptive, scalable accessing of distributed models
CN101110074A (en) * 2007-01-30 2008-01-23 浪潮乐金信息系统有限公司 Data speedup query method based on file system caching
CN101620636A (en) * 2009-08-21 2010-01-06 腾讯科技(北京)有限公司 Method and apparatus for displaying tabular data
CN102129458A (en) * 2011-03-09 2011-07-20 胡劲松 Method and device for storing relational database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
楚材: "《OB数据模型介绍》" (Introduction to the OB Data Model), 《HTTP://CODE.TAOBAO.ORG/P/OCEANBASE/WIKI/MODULE/》, 29 September 2011 (2011-09-29) *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103744617A (en) * 2013-12-20 2014-04-23 北京奇虎科技有限公司 Merging and compressing method and device for data files in key-value storage system
CN103744617B (en) * 2013-12-20 2016-09-28 北京奇虎科技有限公司 The merging compression method of a kind of key-value storage Data File and device
CN103888448A (en) * 2014-03-03 2014-06-25 珠海市君天电子科技有限公司 Method, device and system for data transmission and storage
CN103812877B (en) * 2014-03-12 2016-10-12 西安电子科技大学 Data compression method based on Bigtable distributed memory system
CN106407224B (en) * 2015-07-31 2019-09-13 华为技术有限公司 The method and apparatus of file compacting in a kind of key assignments storage system
US11232073B2 (en) 2015-07-31 2022-01-25 Huawei Technologies Co., Ltd. Method and apparatus for file compaction in key-value store system
CN106407224A (en) * 2015-07-31 2017-02-15 华为技术有限公司 Method and device for file compaction in KV (Key-Value)-Store system
WO2017020576A1 (en) * 2015-07-31 2017-02-09 华为技术有限公司 Method and apparatus for file compaction in key-value storage system
CN105653587B (en) * 2015-12-21 2019-02-19 厦门市美亚柏科信息股份有限公司 Heterologous isomeric data cleaning method and its system
CN105653587A (en) * 2015-12-21 2016-06-08 厦门市美亚柏科信息股份有限公司 Heterogeneous data cleaning method and system thereof
CN108875082A (en) * 2018-07-17 2018-11-23 北京奇安信科技有限公司 A kind of Large Volume Data read-write processing method and device
CN110874358A (en) * 2018-08-30 2020-03-10 阿里巴巴集团控股有限公司 Multi-attribute column storage and retrieval method and device and electronic equipment
CN110874358B (en) * 2018-08-30 2023-05-05 阿里巴巴集团控股有限公司 Multi-attribute column storage and retrieval method and device and electronic equipment
CN110222046A (en) * 2019-04-28 2019-09-10 阿里巴巴集团控股有限公司 Processing method, device, server and the storage medium of table data
CN110222046B (en) * 2019-04-28 2023-11-03 北京奥星贝斯科技有限公司 List data processing method, device, server and storage medium
CN111121683A (en) * 2019-12-05 2020-05-08 山西裕鼎精密科技有限公司 Data processing apparatus, method and computer storage medium
CN112291359A (en) * 2020-11-02 2021-01-29 浙江苍南仪表集团股份有限公司 Data transmission format processing method, system, device and readable storage medium
CN112291359B (en) * 2020-11-02 2022-07-26 浙江苍南仪表集团股份有限公司 Data transmission format processing method, system, device and readable storage medium
CN112612786A (en) * 2020-11-24 2021-04-06 北京思特奇信息技术股份有限公司 Large-data-volume row-column conversion method and system
CN113312338A (en) * 2021-06-29 2021-08-27 中国农业银行股份有限公司 Data consistency checking method, device, equipment, medium and program product
CN114625805A (en) * 2022-05-16 2022-06-14 杭州时代银通软件股份有限公司 Method, device, equipment and medium for configuration of return test
CN114625805B (en) * 2022-05-16 2022-09-20 杭州时代银通软件股份有限公司 Return test configuration method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN103218365A (en) SS Table file data processing method and system
US7243110B2 (en) Searchable archive
JP7257068B2 (en) Systems, methods, and data structures for fast searching or filtering of large datasets
US9652467B2 (en) Inline tree data structure for high-speed searching and filtering of large datasets
CN107016001B (en) Data query method and device
CN110046168B (en) Incremental data consistency implementation method and device
CN108205577B (en) Array construction method, array query method, device and electronic equipment
CN105279213A (en) Retrieval device and retrieval method for log database
US9406018B2 (en) Systems and methods for semantic data integration
CN103714096A (en) Lucene-based inverted index system construction method and device, and Lucene-based inverted index system data processing method and device
CN103840969A (en) Alarm log management method and system in cloud computing system
WO2014106418A1 (en) Method and apparatus for storing and reading files
CN101963993B (en) Method for fast searching database sheet table record
CN110399396B (en) Efficient data processing
KR101440475B1 (en) Method for creating index for mixed query process, method for processing mixed query, and recording media for recording index data structure
CN114185934A (en) Indexing and query method and system based on Tiandun database column storage
CN112364007B (en) Mass data exchange method, device, equipment and storage medium based on database
CN113625967B (en) Data storage method, data query method and server
CN113377721B (en) File table design method for storing files in database
CN115809268B (en) Adaptive query method and device based on fragment index
CN115309742A (en) Table building method and device, electronic equipment and storage medium
CN114385584A (en) Data writing method and data reading method
CN113297040A (en) Method and apparatus for determining insight data, computer storage medium, and electronic device
JP2007323372A (en) Business document management system, business document management device, business document management method, business document management program and storage medium
CN114281814A (en) Data duplicate checking method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1183541; Country of ref document: HK)

RJ01 Rejection of invention patent application after publication

Application publication date: 20130724