CN102999519B - Read-write method and system for database - Google Patents
- Publication number: CN102999519B
- Application number: CN201110273321.5A
- Authority: CN (China)
- Prior art keywords: data, write, node, replica node, reading
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the invention discloses a read-write method and system for a database. The method horizontally partitions record data by primary key into a plurality of data segments. Each data segment is stored as one write copy and corresponding read copies. The write copy uses row storage, which optimizes the write performance of the database; the read copies use column storage, and the data in each read copy is organized in a different manner, which optimizes the read performance of the database. The method further establishes a global index and local indexes, so that the position to operate on can be located quickly when data is written or read. With the embodiments of the method or system, data can be both written quickly and read quickly.
Description
Technical field
The present invention relates to the field of computer data processing, and more particularly to a read-write method and system for a database.
Background art
With the development of society and of science and technology, computers are used ever more widely in daily life and work. Today, users need computers to store and process large amounts of data for both daily life and work. A database system is an application system that meets this need; it is a core mechanism for data processing that grew out of the needs of data processing. In daily life and work, users need to be able to perform access operations on a database, such as storing and querying data, conveniently.
A distributed database system is one kind of database system. It developed from the centralized database system and is the product of combining computer technology with network technology. It comprises clients, a metadata node and data storage nodes. The client is deployed on a business application server, and the business application sends data access requests to the distributed storage system cluster through the client; the metadata node stores metadata information; the data storage nodes store data blocks.
In prior-art database read-write methods, the database being read and written uses one of two data storage formats. The first is row storage. A row-oriented database can speed up reads and writes only by relying on external indexes, and maintaining those indexes consumes a great deal of time and space, so reading and writing a row-oriented database consumes substantial time and space resources. The second is column storage. Because column storage inherently has a high compression ratio, it does not consume as much time and space, but writing one row of data requires multiple disk operations, so its write performance is low.
Therefore, against the background of today's huge data volumes, how to provide a new database read-write method that achieves both fast writing and fast reading of data is a technical problem that the prior art urgently needs to solve.
Summary of the invention
In view of this, the invention provides a read-write method and system for a database, to overcome the problem in the prior art that using a single storage mode leads to poor write performance or poor read performance.
To achieve the above object, the invention provides the following technical solution:
A read-write method for a database, in which a metadata node horizontally partitions record data by primary key into a plurality of data segments, each data segment is stored as one write copy and a corresponding plurality of read copies, the write copy is stored in row storage form, the read copies are stored in column storage form, and the data in each read copy is organized in a different manner, the method comprising:
Receiving an access request initiated by a client;
In the case where the access request is a data write operation:
Determining, according to the global index stored in the metadata node and the primary key in the access request, the data interval into which the data to be written must be written and the write replica node corresponding to that data interval, the global index indicating the correspondence between primary keys and data intervals and the write replica nodes corresponding to those data intervals;
Initiating an operation request to the write replica node to which the data to be written must be written, the write replica node appending the update data to its increment block, the increment block being a disk file that records update data, and the update data being the set of data to be written within a preset number of records;
In the case where the access request is a data read operation:
Determining whether the access request contains a primary key; if it does, determining, according to the global index stored in the metadata node and the primary key, the data interval in which the data to be read is located and the read replica nodes corresponding to that data interval, and, where other filter conditions exist, having the read replica node determine the local index that best matches the other filter conditions, or, where no other filter conditions exist, selecting a local index arbitrarily, the local index indicating the correspondence between keywords and storage blocks, and a storage block being the smallest unit block of data storage;
If it does not, sending the access request to all data intervals under the current metadata node, and, where other filter conditions exist, having the read replica nodes corresponding to all the data intervals determine the local index that best matches the other filter conditions, or, where no other filter conditions exist, selecting a local index arbitrarily;
Determining the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and sending the access request to each such read replica node;
The read replica node determining whether its increment block contains update data for the data to be read; if so, reading the data to be read from the increment block; if not, reading the data to be read from the storage block indicated by the local index, the increment block being a disk file that records update data.
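The final read step above (check the increment block first, then fall back to the storage block) can be pictured with a minimal Python sketch. The class `ReadReplica`, its in-memory list standing in for the on-disk increment block, and the rule that the newest appended update wins are all assumptions made for illustration; the source does not specify these details.

```python
from typing import Dict, List, Optional, Tuple


class ReadReplica:
    """Toy read replica: an append-only increment log plus base storage blocks."""

    def __init__(self, base_blocks: Dict[int, Dict[str, str]]) -> None:
        # block id -> {key: value}; stands in for column-stored storage blocks
        self.base_blocks = base_blocks
        # appended updates, oldest first; stands in for the increment block file
        self.increment: List[Tuple[str, str]] = []

    def apply_update(self, key: str, value: str) -> None:
        self.increment.append((key, value))

    def read(self, key: str, block_id: int) -> Optional[str]:
        # Assumption: the most recently appended update for a key wins,
        # so the increment log is scanned from newest to oldest.
        for k, v in reversed(self.increment):
            if k == key:
                return v
        # No update found: read from the storage block the local index pointed at.
        return self.base_blocks.get(block_id, {}).get(key)


replica = ReadReplica({0: {"a": "1"}})
before = replica.read("a", 0)
replica.apply_update("a", "2")
after = replica.read("a", 0)
```

In this sketch `before` comes from the storage block and `after` comes from the increment block, which mirrors the two branches of the read step.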
Wherein the method for establishing the global index comprises:
Sampling all write copies, and sorting the sampled data by primary key;
Dividing the sorted data into intervals and assigning a corresponding initial node to each resulting data interval, forming an initial global index;
Sending the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between the data intervals and the initial nodes;
The metadata node receiving the data distribution result returned by each write replica node, dividing storage blocks according to the data distribution results, formulating a data interval balance scheduling plan and sending it to each write replica node, and notifying each write replica node to sort according to the data interval balance scheduling plan, a storage block being the smallest unit block of data storage;
Each write replica node beginning intra-node sorting;
Receiving the results of the sorting and scheduling, performed in units of storage blocks, sent by each write replica node, and establishing the relation between the primary keys and the data intervals and the write replica nodes corresponding to those data intervals.
Wherein the method for establishing the local index comprises:
Where the global index has already been established, assigning corresponding read replica nodes to the data intervals in the global index;
Sending this assignment to each write replica node, each write replica node sending the data records it stores to the corresponding read replica nodes for storage;
Triggering each read replica node to sort the data in its read copy according to a specified sort column;
Saving the sorted data in units of storage blocks, and establishing the relation between keywords and the storage blocks and the read replica nodes corresponding to those storage blocks.
Wherein the method further comprises an operation of creating a filter, the filter being used to determine whether the data to be read is located in a given storage block.
Wherein the method further comprises:
The write replica node sending the update data to the corresponding read replica nodes;
The read replica node writing the update data to its increment block in batches.
Wherein reading the data to be read from the storage block indicated by the local index specifically further comprises:
When the access request is a specific-value query and the query column is a keyword, the read replica node applying the filter to filter the specified storage blocks.
A read-write system for a database, in which a metadata node horizontally partitions record data by primary key into a plurality of data segments, each data segment is stored as one write copy and a corresponding plurality of read copies, the write copy is stored in row storage form, the read copies are stored in column storage form, and the data in each read copy is organized in a different manner, the system comprising:
A metadata node, configured to examine the access request initiated by a client and, in the case where the access request is a data write operation:
Determine, according to the global index it stores and the primary key in the access request, the data interval in which the data to be written is located and the write replica node corresponding to that data interval;
Initiate an operation request to the write replica node to which the data to be written must be written;
In the case where the access request is a data read operation:
Determine whether the access request contains a primary key; if it does, determine, according to the global index it stores and the primary key, the data interval in which the data to be read is located and the read replica nodes corresponding to that data interval;
If it does not, send the access request to all data intervals under the current metadata node;
Determine the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and send the access request to each such read replica node;
A write replica node, configured to append the update data to its increment block after receiving the operation request initiated by the metadata node;
A read replica node, configured to determine, where other filter conditions exist, the local index that best matches the other filter conditions, or, where no other filter conditions exist, to select a local index arbitrarily; to determine the storage blocks in which the data to be read may be located; and to determine whether its increment block contains update data for the data to be read, reading the data to be read from the increment block if so, and from the storage block indicated by the local index if not.
Wherein the metadata node is further configured to: sample all write copies and sort the sampled data by primary key;
Divide the sorted data into intervals and assign a corresponding initial node to each resulting data interval, forming an initial global index;
Send the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between the data intervals and the initial nodes;
Receive the data distribution result returned by each write replica node, divide storage blocks according to the data distribution results, formulate a data interval balance scheduling plan and send it to each write replica node;
Receive the results of the sorting and scheduling, performed in units of storage blocks, sent by each write replica node, and establish the relation between the primary keys and the data intervals and the write replica nodes corresponding to those data intervals;
The write replica node is further configured to:
Distribute data according to the initial global index sent by the metadata node;
Sort within the node and schedule between nodes according to the data interval balance scheduling plan formulated by the metadata node.
Wherein the metadata node is further configured to: assign corresponding read replica nodes to the data intervals in the global index;
Send this assignment to each write replica node;
Trigger each read replica node to sort the data in its read copy according to a specified sort column;
The write replica node is further configured to:
After receiving the assignment sent by the metadata node, send the data records it stores to the corresponding read replica nodes for storage;
The read replica node is further configured to:
Save the sorted data in units of storage blocks, and establish the relation between keywords and the storage blocks and the read replica nodes corresponding to those storage blocks.
Wherein the read replica node is further configured to:
Create a filter.
Wherein the write replica node is further configured to:
Send the update data to the corresponding read replica nodes;
The read replica node is further configured to:
Write the update data to its increment block in batches.
Wherein the read replica node is further configured to:
When the access request is a specific-value query and the query column is a keyword, apply the filter to filter the specified storage blocks.
As can be seen from the above technical solution, compared with the prior art, the invention discloses a read-write method and system for a database. The method horizontally partitions record data into a plurality of data segments, and each data segment is stored as one write copy and a corresponding plurality of read copies. The write copy is stored in row storage form, which optimizes the write performance of the database; the read copies are stored in column storage form, which optimizes the read performance of the database. The method also establishes a global index and local indexes, which are used to quickly locate the position that an access request needs to access. The system can therefore achieve both fast writing and fast reading of data.
Brief description of the drawings
To describe the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the system structure disclosed in an embodiment of the invention;
Fig. 2 is a schematic flowchart of the data write operation disclosed in an embodiment of the invention;
Fig. 3 is a schematic flowchart of establishing the global index disclosed in an embodiment of the invention;
Fig. 4 is a schematic flowchart of the data read operation disclosed in an embodiment of the invention;
Fig. 5 is a schematic flowchart of establishing the local index disclosed in an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Embodiment one
Fig. 1 is a schematic structural diagram of the nodes of an embodiment of the invention in a practical application. One metadata node has n write replica nodes under it, and each write replica node corresponds to m read replica nodes, where n and m are natural numbers. For example, the m read replica nodes corresponding to write replica node 1 are read replica node 1-1, read replica node 1-2, ..., read replica node 1-m. In a practical application, the metadata node in Fig. 1 first horizontally partitions the record data in the database into a plurality of data segments and stores each data segment in one write replica node. The write replica nodes store data in row storage form, which allows users to write to the database quickly. The metadata node then, according to the characteristics of the data, assigns a plurality of read replica nodes to the data segment in each write replica node and copies that data segment to the read replica nodes corresponding to the write replica node. The read replica nodes store data in column storage form, and the read replica nodes under one write replica node organize the data in different sort orders, which allows users to read from the database quickly.
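The difference between the row-stored write copy and the column-stored read copies can be illustrated with a small Python sketch. The sample records, the column names `id`, `city` and `amount`, and the helper `to_columns` are hypothetical and chosen only for illustration; the source does not prescribe a concrete layout.

```python
# Hypothetical records of one data segment.
records = [
    {"id": 1, "city": "Beijing", "amount": 30},
    {"id": 2, "city": "Shanghai", "amount": 10},
    {"id": 3, "city": "Beijing", "amount": 20},
]

# Write copy: row layout. Writing one record is a single append.
row_store = []
for rec in records:
    row_store.append((rec["id"], rec["city"], rec["amount"]))


def to_columns(rows, sort_key):
    """Build a column layout from the records, ordered by `sort_key`."""
    ordered = sorted(rows, key=lambda r: r[sort_key])
    # One list per column; position i in every list belongs to the same record.
    return {name: [r[name] for r in ordered] for name in ordered[0]}


# Read copies: the same data segment, column layout, each in a different order.
read_copy_by_city = to_columns(records, "city")
read_copy_by_amount = to_columns(records, "amount")
```

A query that filters on `amount` would be served from `read_copy_by_amount`, where the relevant column is already sorted, while new records only ever touch `row_store`.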
Referring to Fig. 2, which is a flowchart of an embodiment of the data writing method of the invention, the specific steps may be as follows:
Step 201: Receive an access request initiated by a client, the access request being a data write operation.
In this step, the metadata node receives the data write operation initiated by the client.
Step 202: According to the global index stored in the metadata node and the primary key in the access request, determine the data interval into which the data to be written must be written and the write replica node corresponding to that data interval, the global index indicating the correspondence between primary keys and data intervals and the write replica nodes corresponding to those data intervals.
In this step, the metadata node itself stores the global index. Because the global index records the correspondence between primary keys and data intervals, when the metadata node receives a data write operation carrying a primary key, it can determine from that correspondence which data interval the data to be written belongs in. The metadata node then uses the correspondence between data intervals and write replica nodes recorded in the global index to determine on which write replica node the data write operation should be performed.
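The lookup in step 202 can be sketched in Python as a binary search over interval boundaries. The representation below (a sorted list of interval lower bounds plus a parallel list of node names) and the names `GlobalIndex`, `locate`, `write-1` and so on are assumptions for illustration; the source only states that the global index maps primary keys to data intervals and write replica nodes.

```python
import bisect


class GlobalIndex:
    """Toy global index: sorted interval lower bounds and their write nodes."""

    def __init__(self, lower_bounds, write_nodes):
        # lower_bounds[i] is the smallest primary key of interval i (ascending).
        self.lower_bounds = lower_bounds
        self.write_nodes = write_nodes

    def locate(self, primary_key):
        # Index of the last interval whose lower bound is <= primary_key.
        i = bisect.bisect_right(self.lower_bounds, primary_key) - 1
        # Assumption: keys below the first bound fall into interval 0.
        i = max(i, 0)
        return i, self.write_nodes[i]


index = GlobalIndex([0, 1000, 2000], ["write-1", "write-2", "write-3"])
interval, node = index.locate(1500)
```

Here `locate(1500)` resolves to interval 1 on `write-2`, which is the node the metadata node would forward the write request to.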
The method for establishing the global index is shown in Fig. 3, and its steps are as follows:
Step 301: The metadata node samples all write copies and sorts the sampled data by primary key.
In this step, the metadata node samples the record data in all write replica nodes; the sampling ratio can be defined by the user. The sampled data is then sorted using the primary key as the comparison element.
Step 302: Divide the sorted data into intervals and assign a corresponding initial node to each resulting data interval, forming an initial global index.
In this step, the sorted data is divided into intervals and one initial node is assigned to each interval. Each initial node stores the sampled data that falls in its interval. This forms an unoptimized initial global index.
Step 303: Send the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between the data intervals and the initial nodes.
In this step, the initial global index formed in step 302 is sent to each write replica node. After receiving it, each write replica node compares the primary key of every data record it holds with the intervals in the initial global index, determines which data interval the record belongs to, and sends the record to the write replica node corresponding to that data interval for storage. At the same time, each write replica node records the number of data records in each data interval.
Step 304: The metadata node receives the data distribution result returned by each write replica node, divides storage blocks according to the data distribution results, formulates a data interval balance scheduling plan and sends it to each write replica node, and notifies each write replica node to sort according to the data interval balance scheduling plan, a storage block being the smallest unit block of data storage.
In this step, each write replica node sends the result of the data distribution performed according to the initial global index to the metadata node. The metadata node computes statistics on the distribution of data records across the data intervals, divides storage blocks according to these statistics and the storage block size configured by the administrator, formulates a data interval balance scheduling plan in units of storage blocks, sends the plan to each write replica node, and notifies each write replica node to sort and schedule according to the plan.
Step 305: Each write replica node begins intra-node sorting and inter-node scheduling.
In this step, each write replica node sorts in units of storage blocks according to the data interval balance scheduling plan, and sends the storage blocks that must go to other write replica nodes to the storage block locations specified by the plan.
Step 306: The metadata node receives the results of the sorting and scheduling, performed in units of storage blocks, sent by each write replica node, and establishes the correspondence between the primary keys and the data intervals and the write replica nodes corresponding to those data intervals.
In this step, each write replica node sends the storage-block-level sorting and scheduling results produced in step 305 to the metadata node. The metadata node records the correspondence between the primary keys of each storage block and the data intervals and the write replica nodes corresponding to those data intervals, forming the global index.
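Steps 301 to 303 can be approximated by the following Python sketch: sample the primary keys, sort the sample, cut it into intervals, and then count how each node's records fall into those intervals. Cutting at equal-sized quantiles of the sample is an assumption, since the source does not say how interval boundaries are chosen; the function names, the fixed random seed, and the use of bare integer keys are likewise hypothetical. The balance scheduling of steps 304 to 306 is not modeled.

```python
import bisect
import random


def build_initial_bounds(write_copies, sample_ratio, num_intervals, seed=0):
    """Steps 301-302: sample primary keys, sort them, pick interval split points."""
    rng = random.Random(seed)
    sample = [k for keys in write_copies for k in keys if rng.random() < sample_ratio]
    sample.sort()
    if not sample:
        return []
    # Assumption: split the sorted sample into equal-sized quantiles.
    step = len(sample) / num_intervals
    return [sample[int(i * step)] for i in range(1, num_intervals)]


def distribute(keys, bounds):
    """Step 303: count how many of a node's records fall into each interval."""
    counts = [0] * (len(bounds) + 1)
    for k in keys:
        counts[bisect.bisect_right(bounds, k)] += 1
    return counts


# Two write copies holding even and odd primary keys; sample everything.
copies = [list(range(0, 100, 2)), list(range(1, 100, 2))]
bounds = build_initial_bounds(copies, sample_ratio=1.0, num_intervals=4)
counts_node_0 = distribute(copies[0], bounds)
```

The per-interval counts returned by `distribute` correspond to the data distribution result that each write replica node reports back to the metadata node in step 304.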
Step 203: Initiate an operation request to the write replica node to which the data to be written must be written; the write replica node appends the update data to its increment block, the increment block being a disk file that records update data, and the update data being the set of data to be written within a preset number of records.
In this step, the metadata node initiates a data write operation request to the write replica node determined in step 202. After receiving the request, the write replica node writes the update data to its increment block by appending, which completes the data write.
The method further includes the write replica node sending the update data to the corresponding read replica nodes.
In this embodiment, the metadata node first receives the data write operation request, determines from the global index the data interval into which the data to be written must be written, and sends the access request to the write replica node corresponding to that data interval, which performs the data write operation. The data records in the write replica node are stored in row storage form, which optimizes the write performance of the database. The data writing method embodiment of the invention can therefore achieve fast writing of data.
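The append-only write in step 203, together with the batched forwarding of update data to the read replica nodes, might look like the following Python sketch. The class `WriteReplica`, the in-memory lists standing in for disk files and remote nodes, and the policy of flushing once `batch_size` records have accumulated are assumptions; the source only says that the update data is a set of records within a preset count.

```python
class WriteReplica:
    """Toy write replica: appends rows and forwards them to read replicas in batches."""

    def __init__(self, batch_size, read_replicas):
        self.batch_size = batch_size        # the preset number of records per batch
        self.read_replicas = read_replicas  # lists standing in for remote increment blocks
        self.increment_block = []           # stands in for the append-only disk file
        self.pending = []                   # rows not yet forwarded

    def write(self, row):
        # A write is a single append; nothing is re-sorted on the write path.
        self.increment_block.append(row)
        self.pending.append(row)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Forward the accumulated update data to every read replica as one batch.
        for replica in self.read_replicas:
            replica.extend(self.pending)
        self.pending = []


replica_a, replica_b = [], []
writer = WriteReplica(batch_size=2, read_replicas=[replica_a, replica_b])
writer.write(("k1", "a"))
forwarded_after_first = list(replica_a)
writer.write(("k2", "b"))
```

After the first write nothing has reached the read replicas; the second write completes the batch and both replicas receive the two rows together.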
Embodiment two
Fig. 1 is a schematic structural diagram of the nodes of an embodiment of the invention in a practical application; for the functions and features of each node, refer to the description of Fig. 1 in Embodiment one.
Referring to Fig. 4, which is a flowchart of an embodiment of the data reading method of the invention, the specific steps may be as follows:
Step 401: Receive an access request initiated by a client, the access request being a data read operation.
In this step, the metadata node receives the data read operation initiated by the client.
Step 402: Determine whether the access request contains a primary key. If it does, perform step 403; if it does not, perform step 404.
In this step, the metadata node determines whether the access request contains a primary key.
Step 403: According to the global index stored in the metadata node and the primary key, determine the data interval in which the data to be read is located and the read replica nodes corresponding to that data interval.
In this step, the metadata node itself stores the global index. Because the global index records the correspondence between primary keys and data intervals, when the metadata node receives a data read operation carrying a primary key, it can determine from that correspondence which data interval the data to be read is in. The metadata node then uses the correspondence between data intervals and read replica nodes recorded in the global index to determine on which read replica nodes the data read operation should be performed.
Step 404: Send the access request to all data intervals under the current metadata node.
In this step, when the access request contains no primary key, the metadata node cannot determine which data interval the data to be read is in, so it sends the access request to all data intervals.
Step 405: Determine whether the access request contains other filter conditions. If it does, perform step 406; if it does not, perform step 407.
In this step, the read replica node determines whether the access request contains other filter conditions.
Step 406: The read replica node determines the local index that best matches the other filter conditions, the local index indicating the correspondence between keywords and storage blocks, and a storage block being the smallest unit block of data storage.
In this step, once it is determined that the access request contains other filter conditions, the read replica node selects the local index that best matches the content of those filter conditions and uses it for filtering.
The method for establishing the local index is shown in Fig. 5, and its steps are as follows:
Step 501: Determine whether the global index has been established. If it has, go to step 503; if it has not, go to step 502.
In this step, the metadata node determines whether the global index has been established.
Step 502: Establish the global index.
In this step, when the global index has not been established, the metadata node must first establish it; for the establishment method, refer to the method for establishing the global index in Embodiment one.
Step 503: The metadata node assigns corresponding read replica nodes to the data intervals in the global index.
In this step, when the global index has been established, the metadata node assigns a plurality of read replica nodes to each data interval in the global index according to the characteristics of the record data.
Step 504: Send this assignment to each write replica node; each write replica node sends the data records it stores to the corresponding read replica nodes for storage.
In this step, the metadata node sends the assignment made in step 503 to each write replica node. After receiving the assignment, each write replica node copies the data records it stores to each of its corresponding read replica nodes.
Step 505: The metadata node triggers each read replica node to sort the data in the read replica node according to a specified sort column.
In this step, the metadata node triggers each read replica node to sort its data records by a different keyword.
Step 506: The read replica node saves the sorted data in units of storage blocks and establishes the relation between keywords and the storage blocks and the read replica nodes corresponding to those storage blocks.
In this step, each read replica node stores the storage-block-level sorting results produced in step 505 and records the correspondence between the keywords of each storage block and the data intervals and the read replica nodes corresponding to those data intervals, forming a local index.
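Steps 505 and 506 can be sketched in Python as sorting by the chosen column, cutting the result into fixed-size storage blocks, and recording the first keyword of each block. Representing the local index as a list of first keywords, the fixed `block_size`, and the assumption that keyword values are distinct are all choices made for this illustration, not details given in the source.

```python
import bisect


def build_local_index(rows, sort_column, block_size):
    """Sort rows by `sort_column`, cut them into blocks, index each block's first key."""
    ordered = sorted(rows, key=lambda r: r[sort_column])
    blocks = [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]
    first_keys = [block[0][sort_column] for block in blocks]
    return first_keys, blocks


def candidate_block(first_keys, value):
    """Return the number of the block that could hold `value`.

    Assumes distinct keyword values; duplicates spanning several blocks
    would require checking neighbouring blocks as well.
    """
    return max(bisect.bisect_right(first_keys, value) - 1, 0)


rows = [{"id": i, "amount": a} for i, a in enumerate([50, 10, 40, 20, 30])]
first_keys, blocks = build_local_index(rows, "amount", block_size=2)
block_no = candidate_block(first_keys, 40)
```

With a block size of 2 the five rows form three storage blocks, and a query for `amount == 40` is directed to the second block only.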
A filter is also created while the local index is being established; the filter is used to determine whether the data to be read is located in a given storage block.
Each filter corresponds to a single storage block, and the filter can determine whether the data to be read exists in that storage block. Its insertion method is as follows:
Prepare in advance a bit array of length m, where m is expected to be about 20 times the number of elements in the storage block, and initialize every element of the bit array to 0. When a row is added to the storage block, compute k different hash functions over the index column of the record; each result falls in the range [0, m]. Using the k results as indexes, set the corresponding elements of the bit array to 1.
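The filter described here behaves like a Bloom filter, and a minimal Python sketch of it follows. Deriving the k hash functions from SHA-256 with a one-byte salt, storing one byte per bit, and reducing each hash modulo m (so positions run from 0 to m-1) are assumptions made so the example runs; the source does not name the hash functions. The `might_contain` method mirrors the membership test the read path applies to each storage block.

```python
import hashlib


class BloomFilter:
    """Bit array of length m with k hash functions, one filter per storage block."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit for clarity; all start at 0

    def _positions(self, value):
        data = str(value).encode("utf-8")
        for i in range(self.k):
            # Hypothetical hash family: SHA-256 salted with the function number.
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value):
        for p in self._positions(value):
            self.bits[p] = 1

    def might_contain(self, value):
        # False means definitely absent; True means a further comparison is needed.
        return all(self.bits[p] for p in self._positions(value))


# A storage block with 3 index-column values, so m is about 20 * 3.
block_filter = BloomFilter(m=20 * 3, k=3)
for value in ("alice", "bob", "carol"):
    block_filter.add(value)
```

A value that was added always passes `might_contain`; a value that was never added usually fails it, but can occasionally pass, which is why the text requires a further comparison on a positive answer.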
Step 407: Select a local index arbitrarily.
In this step, when there are no other filter conditions, the read replica node cannot single out one local index for filtering, so it arbitrarily selects a local index to use.
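The choice between steps 406 and 407 can be pictured with the following Python sketch. The source does not define how "best matches" is measured, so the scoring rule used here (the number of leading sort columns of an index that appear among the filter columns) is an assumption, as are the function name and the representation of a local index as a tuple of sort column names.

```python
def choose_local_index(local_indexes, filter_columns):
    """Pick the local index whose leading sort columns best cover the filters.

    local_indexes: list of tuples of sort column names, one tuple per local index.
    filter_columns: set of column names used by the other filter conditions.
    """
    def score(sort_columns):
        matched = 0
        for column in sort_columns:
            if column not in filter_columns:
                break
            matched += 1
        return matched

    # With no filter conditions every score is 0 and max() returns the first
    # index, which plays the role of the arbitrary selection in step 407.
    return max(local_indexes, key=score)


indexes = [("city", "date"), ("amount",), ("date",)]
by_date = choose_local_index(indexes, {"date"})
by_city_date = choose_local_index(indexes, {"city", "date"})
no_filters = choose_local_index(indexes, set())
```

A prefix-based score is used because data sorted by `("city", "date")` is only ordered by `date` within each city, so a filter on `date` alone gains little from that index.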
Step 408: Determine the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and send the access request to each such read replica node.
Through the above steps, the storage blocks in which the data to be read may be located are determined from the data interval and local index already determined, and the read replica nodes corresponding to those storage blocks are then determined from the local index. Once the read replica nodes are determined, the metadata node sends the access request to them.
After receiving the access request from the metadata node, the read replica node performs the following steps when the access request is a specific-value query and the queried column is an index column:
(A) Apply the filters to the storage blocks and increment blocks specified by the access request, and exclude all storage blocks that do not contain the data to be read.
Whether the data to be read exists in a given storage block is judged as follows:
Apply the k different hash functions described above to the data to be read, obtaining k results;
Use these k results as indices into the bit array. If any indexed element is 0, the data to be read does not exist in this storage block;
If none of the indexed elements is 0, the data to be read may exist in this storage block, and a further comparison is needed to decide.
(B) Locate the data according to the other filter conditions and the local index, and obtain the sequence number of the data to be read within the storage block, denoted R;
(C) From the sequence number R obtained by querying the local index, obtain the position of the storage block that holds the data to be read, and prepare to read the data;
(D) Perform the following operations in parallel: open the file that holds the data to be read, look up in the index the largest sequence number less than R, obtain the file offset corresponding to that sequence number and seek to that position, then read the elements of the file one by one until the R-th element is reached.
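Step (D) amounts to a sparse-index lookup followed by a short sequential scan. The sketch below models the column file as a Python list and the index as sorted `(sequence_number, offset)` pairs; both are stand-ins. It also uses the largest indexed sequence number at or below R, a slight relaxation of "less than R" so that R may itself be an indexed position. The function name is hypothetical.

```python
from bisect import bisect_right


def read_rth_element(column_file, sparse_index, r):
    """column_file: list standing in for a file of column elements.
    sparse_index: sorted list of (sequence_number, offset) pairs."""
    seqs = [seq for seq, _ in sparse_index]
    pos = bisect_right(seqs, r) - 1  # largest indexed sequence number <= r
    if pos < 0:
        raise KeyError(f"no index entry at or below sequence number {r}")
    seq, offset = sparse_index[pos]
    element = None
    # Read forward one element at a time from the indexed offset.
    for step in range(r - seq + 1):
        element = column_file[offset + step]
    return element
```

The denser the index, the shorter the forward scan, at the cost of a larger index file.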
Step 409: The read replica node judges whether its increment block contains updated data for the data to be read. If so, step 410 is executed; if not, step 411 is executed. The increment block is a disk file that records updated data.
The updated data in the increment block of the read replica node is sent by the write replica node corresponding to that read replica node.
When the accumulated number of updated records reaches a threshold, the read replica node converts the data from row storage to column storage, applies removable partial compression to each column, and writes the converted data to disk in batches. Because a growing number of accumulated increment blocks on the read replica node lowers the query performance of the system, all data blocks in the node must be merged and sorted periodically to keep the data in order.
The row-to-column conversion may specifically be:
Each attribute component of a row record is split off and stored in its corresponding column file. An attribute component is a piece of information in a row record that has an independent attribute.
The attribute component of a row record that is used for sorting is called the "sort column"; the attribute used for sorting is specified when the local index is created. All other attribute components of a row record are called "non-sort columns". Sort columns and non-sort columns are organized differently when written to disk.
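A minimal sketch of the threshold-triggered row-to-column conversion might look like this. `IncrementBuffer`, `rows_to_columns` and the in-memory `flushed` list (a stand-in for column files on disk) are hypothetical names, and the per-column compression mentioned in the text is left out.

```python
def rows_to_columns(rows, columns):
    """Split each attribute of every row into its own column list."""
    return {name: [row[name] for row in rows] for name in columns}


class IncrementBuffer:
    """Collects row-format updates and converts them once a threshold is hit."""

    def __init__(self, columns, threshold):
        self.columns = columns
        self.threshold = threshold
        self.rows = []      # row-format updates received so far
        self.flushed = []   # stand-in for batches written to column files

    def append(self, row):
        self.rows.append(row)
        if len(self.rows) >= self.threshold:
            # Threshold reached: convert the batch from rows to columns.
            self.flushed.append(rows_to_columns(self.rows, self.columns))
            self.rows = []
```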
Sort columns are stored in order on disk. To locate data quickly, an auxiliary index file is needed; a B+tree may be used. The sort-column index stores, for the starting element of each storage block, its value, its sequence number, and its offset in the file. To read a specific sort-column element value, sort-column elements are read from the file one by one, starting from the offset of the starting element of the storage block, until the queried sort-column element is reached. Adding the number of sort-column elements skipped during the read to the sequence number of the starting element gives the global sequence number of the sort-column element to be read, denoted k. The global sequence number indicates the position of the data record to be read within a storage block.
Non-sort columns have no fixed storage order on disk and likewise use a B+tree. The non-sort-column index stores the file offset and the sequence number of the starting element of each storage block. To read a specific non-sort-column element value, once the global sequence number of the sort-column element to be read has been determined, non-sort-column elements are read from the file one by one, starting from the offset of the starting element, until the k-th non-sort-column element is read.
In this way, reading the specific sort-column element value and the non-sort-column elements corresponding to that sort-column element completes the read of the data to be read.
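The two-stage read described above can be sketched as follows: find the global sequence number k in the sort column, then fetch the k-th element of a non-sort column. For brevity the sketch keeps both columns in lists, treats a list position as both sequence number and file offset, replaces the B+tree with a sorted list of `(start_value, start_sequence_number)` pairs, and assumes the non-sort column is stored in the same row order as the sort column. All names are hypothetical.

```python
from bisect import bisect_right


def lookup_row(sort_values, sort_index, other_column, target):
    """sort_values: sorted sort-column values (the column file).
    sort_index: sorted (start_value, start_seq) pairs, one per storage block.
    other_column: a non-sort column stored in the same row order."""
    starts = [value for value, _ in sort_index]
    pos = bisect_right(starts, target) - 1
    if pos < 0:
        return None  # target is smaller than every indexed value
    _, seq = sort_index[pos]
    # Scan the sort column forward from the block's starting element.
    while seq < len(sort_values) and sort_values[seq] < target:
        seq += 1
    if seq == len(sort_values) or sort_values[seq] != target:
        return None
    k = seq  # global sequence number of the matching row
    return k, other_column[k]
```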
Step 410: Read the data to be read from the increment block.
In this step, when it is determined that the increment block of the read replica node contains updated data for the data to be read, the updated data is read directly from that increment block.
Step 411: Read the data to be read from the storage block indicated by the local index.
In this step, when it is determined that the increment block of the read replica node contains no updated data for the data to be read, the data to be read is read from the storage block determined in step 408.
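Steps 409 to 411 reduce to checking the increment block first and falling back to the storage block. In this sketch both blocks are modelled as dictionaries keyed by record key, which is an assumption made for brevity; `read_record` is a hypothetical name.

```python
def read_record(key, increment_block, storage_block):
    """Both blocks are modelled as dicts keyed by the record key."""
    if key in increment_block:       # step 409 finds an update: step 410
        return increment_block[key]
    return storage_block.get(key)    # no update: step 411
```

Checking the increment block first means a reader sees the latest update even before it has been merged into the sorted storage blocks.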
In the present embodiment, the metadata node first receives the data read request and judges whether the access request contains a primary key and other filter conditions. If there is a primary key, the data interval holding the data to be read can be determined from the primary key and the global index; if there is no primary key, the access request is sent to all data intervals. If the access request contains other filter conditions, the local index that best matches those filter conditions is chosen for filtering; if it contains no other filter conditions, a local index is selected arbitrarily for filtering. After the data interval and storage block holding the data to be read have been determined, the access request is sent to the read replica node corresponding to the determined data interval, and that read replica node performs the read. When the access request is a specific-value query and the queried column is an index column, the filters in the local index can also be used to quickly locate the storage block holding the data to be read.
The data records in the read replica nodes are stored in column storage form, which optimizes the read performance of the database, so the data reading method of this embodiment of the present invention can read data quickly.
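The routing decision made by the metadata node can be sketched as a range lookup over the global index. The class `GlobalIndex`, the representation of each data interval by its lowest primary key, and the node names are hypothetical.

```python
from bisect import bisect_right


class GlobalIndex:
    def __init__(self, lower_bounds, read_nodes):
        # lower_bounds[i] is the smallest primary key of data interval i (sorted).
        self.lower_bounds = lower_bounds
        # read_nodes[i] lists the read replica nodes of data interval i.
        self.read_nodes = read_nodes

    def route_read(self, primary_key=None):
        """Return the read replica nodes that should receive the request."""
        if primary_key is None:
            # No primary key: the request goes to every data interval.
            return [node for nodes in self.read_nodes for node in nodes]
        i = max(bisect_right(self.lower_bounds, primary_key) - 1, 0)
        return list(self.read_nodes[i])
```

With a primary key only one interval is contacted; without one the request fans out to all intervals, which is why the local indexes and filters matter for pruning work on each node.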
Embodiment three
A read-write system for a database is shown in Fig. 1, which is a system structure diagram disclosed in an embodiment of the present invention.
Record data is horizontally partitioned by primary key into multiple data segments. Each data segment is saved as one write copy and multiple corresponding read copies, where the write copy is stored in row storage form, the read copies are stored in column storage form, and the data in each read copy is organized in a different way. The system may include:
A metadata node, configured to judge the access request initiated by a client. In the case where the access request is a data write operation:
Determine, from the global index it holds and the primary key in the access request, the data interval in which the data to be written is located and the write replica node corresponding to that data interval;
Initiate an operation request to the write replica node to which the data needs to be written;
In the case where the access request is a data read operation:
Judge whether the access request contains a primary key. If it does, determine from the global index it holds and the primary key the data interval holding the data to be read and the read replica node corresponding to that data interval;
If there is no primary key, send the access request to all data intervals under the current metadata node;
Determine the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and send the access request to each of those read replica nodes;
A write replica node, configured to append the updated data to its increment block after receiving the operation request initiated by the metadata node;
A read replica node, configured to: when there are other filter conditions, determine the local index that best matches them; when there are no other filter conditions, arbitrarily select a local index; determine the storage blocks in which the data to be read may be located; and judge whether its increment block contains updated data for the data to be read. If so, the data to be read is read from the increment block; if not, it is obtained from the storage block indicated by the local index.
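The write path through these components can be sketched in a few lines: the metadata node picks the write replica node whose data interval covers the primary key, and that node appends the update to its increment block. `WriteReplicaNode`, `route_write`, and the list standing in for the on-disk increment file are hypothetical.

```python
from bisect import bisect_right


class WriteReplicaNode:
    def __init__(self):
        self.increment_block = []  # stand-in for the on-disk increment file

    def apply(self, updates):
        # Updates are only appended, never rewritten in place.
        self.increment_block.extend(updates)


def route_write(lower_bounds, write_nodes, primary_key):
    """Pick the write replica node whose data interval covers primary_key.
    lower_bounds[i] is the smallest primary key of data interval i (sorted)."""
    i = max(bisect_right(lower_bounds, primary_key) - 1, 0)
    return write_nodes[i]
```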
In practical applications, while the global index held by the metadata node is being built, the metadata node may be configured to:
Sample all write copies and sort the sampled data by primary key;
Divide the sorted data into intervals and assign a corresponding start node to each resulting data interval, forming an initial global index;
Send the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between data intervals and start nodes;
Receive the data distribution results returned by the write replica nodes, divide data blocks according to those results, formulate a data interval balance scheduling plan, and send it to each write replica node;
Receive from each write replica node the results of the sorting and scheduling carried out in units of storage blocks, and establish the relation between the primary key, the data intervals, and the write replica nodes corresponding to the data intervals;
While the global index held by the metadata node is being built, the write replica node may also be configured to:
Distribute data according to the initial global index sent by the metadata node;
Sort within the node and schedule between nodes according to the data interval balance scheduling plan formulated by the metadata node.
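The first two actions of the metadata node, sampling and interval division, can be sketched as below. The function name, the equal-count split of the sorted sample, and the representation of the initial global index as a list of interval lower bounds are assumptions for illustration; the balancing and scheduling that follow are not modelled.

```python
import random


def build_initial_global_index(write_copies, num_intervals, sample_size, seed=0):
    """write_copies: one list of primary keys per write replica node.
    Returns the lower bound of each data interval (illustrative layout)."""
    rng = random.Random(seed)
    sample = []
    for keys in write_copies:
        # Sample each write copy, then pool the samples.
        sample.extend(rng.sample(keys, min(sample_size, len(keys))))
    sample.sort()
    # Cut the sorted sample into roughly equal-sized intervals.
    step = max(len(sample) // num_intervals, 1)
    return [sample[i] for i in range(0, len(sample), step)][:num_intervals]
```

Sampling keeps the sort cheap, and cutting the sample by count rather than by key range tends to give intervals with similar amounts of data even when keys are skewed.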
In practical applications, while the local index is being built, the metadata node may also be configured to:
Assign corresponding read replica nodes to the data intervals in the global index;
Send the above assignment to each write replica node;
Trigger each read replica node to sort the data in its read copy by the specified sort column;
While the local index is being built, the write replica node may also be configured to:
After receiving the assignment sent by the metadata node, send the data records it stores to the corresponding read replica nodes for storage;
While the local index is being built, the read replica node may also be configured to:
Save the sorted data in units of storage blocks, and establish the relation between keywords, the storage blocks, and the read replica nodes corresponding to the storage blocks.
In other embodiments, the read replica node may also be configured to:
Create filters.
In other embodiments, the write replica node may also be configured to:
Send the updated data to the corresponding read replica node;
The read replica node may also be configured to:
Write the updated data to the increment block in batches.
In other embodiments, the read replica node may also be configured to:
When the access request is a specific-value query and the queried column is a keyword, filter the specified storage blocks using the filters.
The data read-write system disclosed in the present embodiment optimizes the read and write performance of the database, enabling both fast writing and fast reading of data.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (12)
1. A read-write method for a database, characterized in that record data is horizontally partitioned by primary key into multiple data segments, each data segment is saved as one write copy and multiple corresponding read copies, the write copy is stored in row storage form, the read copies are stored in column storage form, and the data in each read copy is organized in a different way, the method comprising:
Receiving an access request initiated by a client;
In the case where the access request is a data write operation:
Determining, from the global index held in the metadata node and the primary key in the access request, the data interval into which the data to be written needs to be written and the write replica node corresponding to that data interval, the global index indicating the correspondence between primary keys, data intervals, and the write replica nodes corresponding to the data intervals;
Initiating an operation request to the write replica node to which the data needs to be written, the write replica node appending the updated data to its increment block, the increment block being a disk file that records updated data, and the updated data being a set of data to be written within a preset number of records;
In the case where the access request is a data read operation:
Judging whether the access request contains a primary key; if it does, determining from the global index held in the metadata node and the primary key the data interval holding the data to be read and the read replica node corresponding to that data interval, and, when there are other filter conditions, determining by the read replica node the local index that best matches the other filter conditions, or, when there are no other filter conditions, arbitrarily selecting a local index, the local index indicating the correspondence between keywords and storage blocks, a storage block being the smallest unit block of data storage;
If there is no primary key, sending the access request to all data intervals under the current metadata node, and, when there are other filter conditions, determining by the read replica nodes corresponding to all the data intervals the local index that best matches the other filter conditions, or, when there are no other filter conditions, arbitrarily selecting a local index;
Determining the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and sending the access request to each of those read replica nodes;
The read replica node judging whether its increment block contains updated data for the data to be read; if so, reading the data to be read from the increment block; if not, reading the data to be read from the storage block indicated by the local index, the increment block being a disk file that records updated data.
2. The method according to claim 1, characterized in that the method of building the global index comprises:
Sampling all write copies and sorting the sampled data by primary key;
Dividing the sorted data into intervals and assigning a corresponding start node to each resulting data interval, forming an initial global index;
Sending the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between data intervals and start nodes;
The metadata node receiving the data distribution results returned by the write replica nodes, dividing storage blocks according to those results, formulating a data interval balance scheduling plan and sending it to each write replica node, and notifying each write replica node to sort according to the data interval balance scheduling plan, a storage block being the smallest unit block of data storage;
Each write replica node starting the sort within the node;
Receiving from each write replica node the results of the sorting and scheduling carried out in units of storage blocks, and establishing the relation between the primary key, the data intervals, and the write replica nodes corresponding to the data intervals.
3. The method according to claim 1, characterized in that the method of building the local index comprises:
Once the global index has been built, assigning corresponding read replica nodes to the data intervals in the global index;
Sending the above assignment to each write replica node, the write replica node sending the data records it stores to the corresponding read replica nodes for storage;
Triggering each read replica node to sort the data in its read copy by the specified sort column;
Saving the sorted data in units of storage blocks, and establishing the relation between keywords, the storage blocks, and the read replica nodes corresponding to the storage blocks.
4. The method according to claim 3, characterized in that the method further comprises creating filters, a filter being used to judge whether the data to be read is located in a given storage block.
5. The method according to claim 1, characterized in that the method further comprises:
The write replica node sending the updated data to the corresponding read replica node;
The read replica node writing the updated data to the increment block in batches.
6. The method according to claim 1, characterized in that obtaining the data to be read from the storage block indicated by the local index further comprises:
When the access request is a specific-value query and the queried column is a keyword, the read replica node applying the filters to filter the specified storage blocks.
7. A read-write system for a database, characterized in that record data is horizontally partitioned by primary key into multiple data segments, each data segment is saved as one write copy and multiple corresponding read copies, the write copy is stored in row storage form, the read copies are stored in column storage form, and the data in each read copy is organized in a different way, the system comprising:
A metadata node, configured to judge the access request initiated by a client and, in the case where the access request is a data write operation:
Determine, from the global index it holds and the primary key in the access request, the data interval in which the data to be written is located and the write replica node corresponding to that data interval;
Initiate an operation request to the write replica node to which the data needs to be written;
In the case where the access request is a data read operation:
Judge whether the access request contains a primary key; if it does, determine from the global index it holds and the primary key the data interval holding the data to be read and the read replica node corresponding to that data interval;
If there is no primary key, send the access request to all data intervals under the current metadata node;
Determine the storage blocks in which the data to be read may be located and the read replica nodes corresponding to those storage blocks, and send the access request to each of those read replica nodes;
A write replica node, configured to append the updated data to its increment block after receiving the operation request initiated by the metadata node;
A read replica node, configured to: when there are other filter conditions, determine the local index that best matches the other filter conditions; when there are no other filter conditions, arbitrarily select a local index; determine the storage blocks in which the data to be read may be located; and judge whether its increment block contains updated data for the data to be read; if so, read the data to be read from the increment block; if not, obtain the data to be read from the storage block indicated by the local index.
8. The system according to claim 7, characterized in that the metadata node is further configured to:
Sample all write copies and sort the sampled data by primary key;
Divide the sorted data into intervals and assign a corresponding start node to each resulting data interval, forming an initial global index;
Send the initial global index to each write replica node, so that each write replica node distributes data according to the correspondence between data intervals and start nodes;
Receive the data distribution results returned by the write replica nodes, divide data blocks according to those results, formulate a data interval balance scheduling plan, and send it to each write replica node;
Receive from each write replica node the results of the sorting and scheduling carried out in units of storage blocks, and establish the relation between the primary key, the data intervals, and the write replica nodes corresponding to the data intervals;
The write replica node is further configured to:
Distribute data according to the initial global index sent by the metadata node;
Sort within the node and schedule between nodes according to the data interval balance scheduling plan formulated by the metadata node.
9. The system according to claim 7, characterized in that the metadata node is further configured to:
Assign corresponding read replica nodes to the data intervals in the global index;
Send the above assignment to each write replica node;
Trigger each read replica node to sort the data in its read copy by the specified sort column;
The write replica node is further configured to:
After receiving the assignment sent by the metadata node, send the data records it stores to the corresponding read replica nodes for storage;
The read replica node is further configured to:
Save the sorted data in units of storage blocks, and establish the relation between keywords, the storage blocks, and the read replica nodes corresponding to the storage blocks.
10. The system according to claim 7, characterized in that the read replica node is further configured to:
Create filters.
11. The system according to claim 7, characterized in that the write replica node is further configured to:
Send the updated data to the corresponding read replica node;
The read replica node is further configured to:
Write the updated data to the increment block in batches.
12. The system according to claim 7, characterized in that the read replica node is further configured to:
When the access request is a specific-value query and the queried column is a keyword, filter the specified storage blocks using the filters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110273321.5A CN102999519B (en) | 2011-09-15 | 2011-09-15 | Read-write method and system for database |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102999519A CN102999519A (en) | 2013-03-27 |
CN102999519B true CN102999519B (en) | 2017-05-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170517 Termination date: 20180915 |