CN103488778B - Data query method and device - Google Patents

Info

Publication number
CN103488778B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310459279.5A
Other languages
Chinese (zh)
Other versions
CN103488778A (en
Inventor
李烨
陈浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201310459279.5A
Publication of CN103488778A
Application granted
Publication of CN103488778B
Legal status: Active
Anticipated expiration

Abstract

Embodiments of the invention disclose a data query method and device. The method includes: receiving a data query request; determining two data tables that need to be hash-partitioned; sending information about the data tables to each slave node; receiving the correspondence between hash values and table shard files sent by the slave nodes; segmenting the hash values to obtain hash value segments and determining the slave node corresponding to each hash value segment; controlling, according to the correspondence between hash values and table shard files, each slave node to migrate table shard files; and receiving the join result for each hash value sent by each slave node and determining the result of the data query request from the join results. When an MPP distributed database system uses a left outer join or a right outer join to process a data query, the invention can obtain a correct left or right outer join result and therefore a correct query result.

Description

Data query method and device
Technical field
The present invention relates to storage technology, and in particular to a data query method and device.
Background
At present, databases are generally used to store data. When a user queries data through a client, the client sends the query keywords entered by the user to the database. The database performs the query according to those keywords and returns the query result to the client, which then displays it to the user.
In the single-machine databases in common use today, data queries frequently rely on operations such as the left outer join and the right outer join, and the query result is determined from the result of those joins. A left or right outer join operates on two tables, called the left table and the right table, and determines the join relationship between them. The result set of a left outer join contains every row of the left table that satisfies the join condition with some row of the right table, together with every row of the left table that satisfies the join condition with no row of the right table. For the left-table rows that do not satisfy the join condition, the attributes that would come from the right table are filled with null values. The right outer join is symmetric to the left outer join: its result set contains every row of the right table that satisfies the join condition with some row of the left table, together with every row of the right table that satisfies the join condition with no row of the left table. For the right-table rows that do not satisfy the join condition, the attributes that would come from the left table are filled with null values.
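The outer join semantics described above can be illustrated with a minimal Python sketch. The row representation (dictionaries), the function names, and the sample tables `orders` and `payments` are assumptions made for illustration only; they do not come from the patent.

```python
def left_outer_join(left, right, key):
    """Return every left row, paired with each matching right row or with None."""
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            result.extend((l_row, r_row) for r_row in matches)
        else:
            # No right row satisfies the join condition: pad with a null.
            result.append((l_row, None))
    return result


def right_outer_join(left, right, key):
    """Symmetric definition: swap the roles of the two tables."""
    return [(l_row, r_row) for r_row, l_row in left_outer_join(right, left, key)]


# Hypothetical sample tables.
orders = [{"id": 1, "item": "a"}, {"id": 2, "item": "b"}]
payments = [{"id": 1, "paid": True}, {"id": 3, "paid": False}]
joined = left_outer_join(orders, payments, "id")
```

Here the order with `id` 2 has no payment, so it appears once with `None` in place of the right-table attributes.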
The traditional way to perform a left outer join of a left table and a right table in a single-machine database is as follows: all data of both tables is hash-partitioned on the join attribute; for each hash value, a left outer join is performed between the left-table data and the right-table data with that hash value; finally, the left outer join results of all hash values are merged.
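A rough sketch of this single-machine procedure follows. It assumes integer join keys and a hypothetical modulo hash function with four hash values; the patent does not prescribe a particular hash function.

```python
from collections import defaultdict


def hash_value(key, buckets=4):
    # Hypothetical hash function for integer join keys.
    return key % buckets


def split_by_hash(rows, field, buckets=4):
    """Group rows by the hash value of their join attribute."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash_value(row[field], buckets)].append(row)
    return parts


def left_outer_join(left, right, field):
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[field] == l_row[field]]
        if matches:
            out.extend((l_row, r_row) for r_row in matches)
        else:
            out.append((l_row, None))
    return out


def hashed_left_outer_join(left, right, field, buckets=4):
    """Join the two tables one hash value at a time, then merge the results."""
    left_parts = split_by_hash(left, field, buckets)
    right_parts = split_by_hash(right, field, buckets)
    result = []
    for h in sorted(left_parts):
        result.extend(left_outer_join(left_parts[h], right_parts.get(h, []), field))
    return result


left = [{"k": i} for i in range(6)]
right = [{"k": 1, "v": 10}, {"k": 5, "v": 50}, {"k": 9, "v": 90}]
got = hashed_left_outer_join(left, right, "k")
```

Because rows with equal join keys always share a hash value, joining per hash value returns the same rows as joining the whole tables, only in a different order.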
Hashing means transforming an input of arbitrary range, through some function, into an output of fixed range, or an input of arbitrary length into an output of fixed length. The output is the hash value and the function is the hash function. Hashing is a compressing map: the space of hash values is usually much smaller than the input space, so different inputs may hash to the same output. In other words, equal hash values are a necessary but not sufficient condition for equal inputs. Hash partitioning separates inputs with different hash values and groups together inputs with the same hash value.
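The "necessary but not sufficient" property can be seen in a small sketch. The use of CRC32 and of four hash values is an assumption chosen only to make the example deterministic; any hash function behaves the same way in principle.

```python
import zlib


def hash_value(key, buckets=4):
    """Map an arbitrary key onto one of `buckets` values (deterministic CRC32)."""
    return zlib.crc32(str(key).encode("utf-8")) % buckets


# Six hypothetical keys mapped onto four hash values: by the pigeonhole
# principle at least two different keys must share a hash value.
keys = ["alice", "bob", "carol", "dave", "erin", "frank"]
groups = {}
for key in keys:
    groups.setdefault(hash_value(key), []).append(key)
```

Equal keys always land in the same group, but a group may also hold unequal keys, which is why the join condition still has to be checked inside each group.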
However, as database technology has developed, MPP distributed databases, which are based on massively parallel processing (MPP) and a shared-nothing architecture, are increasingly used to store and query massive data. Compared with a single-machine database, an MPP distributed database executes database tasks such as data import and data query in parallel on multiple nodes, which improves the performance and availability of the database.
When such an MPP distributed database performs a data query, the master node generally splits a single data query task into multiple small query tasks that can be executed concurrently on multiple slave nodes, and the results of all the small query tasks are then aggregated into the final query result.
However, suppose a data query in the MPP distributed database needs a left or right outer join, for example a left outer join. When the database receives the user's data query request, the master node splits the request into multiple small query tasks that each include a left outer join, and each slave node performs a left outer join according to its small query task. Merging the slave nodes' left outer join results at the master node then does not yield a correct left outer join result, so a correct query result cannot be obtained either.
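One way to see the problem is with a toy example in which matching rows happen to be stored on different slave nodes. The two-node layout and the data below are hypothetical and serve only as an illustration.

```python
def left_outer_join(left, right, field):
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[field] == l_row[field]]
        if matches:
            out.extend((l_row, r_row) for r_row in matches)
        else:
            out.append((l_row, None))
    return out


# Each slave node stores a fragment of the left table and of the right table.
node1 = {"left": [{"id": 1}], "right": [{"id": 2, "v": "x"}]}
node2 = {"left": [{"id": 2}], "right": [{"id": 1, "v": "y"}]}

# Naive approach: every node joins only its local fragments, results are merged.
naive = []
for node in (node1, node2):
    naive += left_outer_join(node["left"], node["right"], "id")

# Reference: join the complete tables.
all_left = node1["left"] + node2["left"]
all_right = node1["right"] + node2["right"]
correct = left_outer_join(all_left, all_right, "id")
```

In the naive result both left rows are padded with nulls, although each of them has a matching right row on the other node.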
Summary of the invention
Embodiments of the present invention provide a data query method and device that can obtain a correct left or right outer join result, and hence a correct query result, when an MPP distributed database system uses a left or right outer join to process a data query.
In a first aspect, a data query method is provided, including:
receiving a data query request;
determining, according to the data query request, two data tables that need to be hash-partitioned;
sending information about the data tables to each slave node;
receiving, from each slave node, a correspondence between hash values and table shard files, where the correspondence is obtained by the slave node hash-partitioning the data tables indicated by the information about the data tables;
segmenting the hash values to obtain hash value segments, and determining the slave node corresponding to each hash value segment;
controlling, according to the correspondence between hash values and table shard files, each slave node to migrate table shard files, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, whereupon each slave node merges the table shard files of one of the two data tables by hash value and, per hash value, performs a first outer join between the merged table shard files and the table shard files of the other data table to obtain a join result for each hash value;
receiving, from each slave node, the join result for each hash value, and determining the result of the data query request according to the join results.
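The segmentation step on the master node might look like the following sketch. The text here does not fix a segmentation rule, so contiguous, equally sized segments are an assumption, as are the node names, the file names, and the shape of the `reports` structure.

```python
def segment_hash_values(hash_values, nodes):
    """Split the sorted hash values into len(nodes) contiguous segments."""
    values = sorted(set(hash_values))
    size = -(-len(values) // len(nodes))  # ceiling division
    return {node: values[i * size:(i + 1) * size] for i, node in enumerate(nodes)}


# Hypothetical correspondences reported by two slave nodes:
# slave node -> hash value -> names of its local table shard files.
reports = {
    "slave1": {0: ["s1_A_0", "s1_B_0"], 1: ["s1_A_1"]},
    "slave2": {1: ["s2_B_1"], 2: ["s2_A_2"], 3: ["s2_B_3"]},
}
all_hashes = [h for mapping in reports.values() for h in mapping]
segments = segment_hash_values(all_hashes, sorted(reports))
```

With four hash values and two slave nodes, each node becomes responsible for two consecutive hash values.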
With reference to the first aspect, in a first possible implementation of the first aspect, the method further includes:
determining, according to the data query request, the table field used as the basis for hash partitioning and for the first outer join;
sending information about the table field to each slave node.
With reference to the first aspect and/or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, determining the result of the data query request according to the join results includes:
merging the join results for all hash values to obtain a first outer join result;
determining the result of the data query request according to the first outer join result.
With reference to the first aspect, and/or the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, in a third possible implementation of the first aspect, controlling each slave node to migrate table shard files includes:
sending a first migration command to each slave node, where the first migration command includes, for each hash value in the hash value segment corresponding to the slave node, the corresponding table shard files and information about the slave nodes on which those files reside; the first migration command instructs the slave node to fetch the table shard files from the slave nodes on which they reside.
With reference to the first aspect, and/or the first possible implementation of the first aspect, and/or the second possible implementation of the first aspect, in a fourth possible implementation of the first aspect, controlling each slave node to migrate table shard files includes:
sending a second migration command to each slave node, where the second migration command includes information about the slave node corresponding to each hash value segment; the second migration command instructs the slave node to send its local table shard files to the slave nodes corresponding to those files, where the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the file's hash value.
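The two migration variants can be contrasted in a short sketch: in the first, the master tells each slave node what to fetch and from where (pull); in the second, it tells every slave node which node owns which hash values, and each node sends its own files (push). The command layouts, node names, and file names are hypothetical, and skipping files that are already on the owning node is an assumption.

```python
def build_pull_commands(reports, segments):
    """First migration command: per slave node, the files to fetch and their location."""
    commands = {node: [] for node in segments}
    for owner, hashes in segments.items():
        for source, mapping in reports.items():
            if source == owner:
                continue  # assumed: local files need no migration
            for h in hashes:
                for name in mapping.get(h, []):
                    commands[owner].append({"hash": h, "file": name, "source": source})
    return commands


def build_push_command(segments):
    """Second migration command: the same hash-value-to-owner map for every slave node."""
    return {h: owner for owner, hashes in segments.items() for h in hashes}


def files_to_push(node, local_files, owner_of):
    """What one slave node sends after receiving the second migration command."""
    return [(name, owner_of[h])
            for h, names in local_files.items()
            for name in names
            if owner_of[h] != node]


reports = {
    "slave1": {0: ["s1_A_0", "s1_B_0"], 1: ["s1_A_1"]},
    "slave2": {1: ["s2_B_1"], 2: ["s2_A_2"], 3: ["s2_B_3"]},
}
segments = {"slave1": [0, 1], "slave2": [2, 3]}
pull = build_pull_commands(reports, segments)
owner_of = build_push_command(segments)
push_from_slave2 = files_to_push("slave2", reports["slave2"], owner_of)
```

Both variants move the same file here: `s2_B_1` travels from `slave2` to `slave1`, which owns hash value 1.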
In a second aspect, a data query method is provided, including:
receiving information about two data tables sent by a master node;
hash-partitioning each of the two data tables indicated by the information, as stored locally on the slave node, to obtain the table shard files corresponding to each hash value;
sending the correspondence between hash values and table shard files to the master node;
migrating table shard files under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, where the hash value segments are obtained by the master node segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table shard files;
merging the table shard files of one of the two data tables by hash value to obtain a merged table shard file for each hash value;
performing, per hash value, a first outer join between the merged table shard file and the table shard files of the other data table to obtain a join result for each hash value;
sending the join result for each hash value to the master node, so that the master node determines the result of the data query request according to the join results.
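The merge and join steps on a slave node could be sketched as follows for a left outer join. Which of the two tables is merged is not stated in this passage; the sketch assumes it is the right table, so that every left row sees all right rows with the same hash value. The in-memory shard representation is also an assumption.

```python
def left_outer_join(left, right, field):
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[field] == l_row[field]]
        if matches:
            out.extend((l_row, r_row) for r_row in matches)
        else:
            out.append((l_row, None))
    return out


def join_segment(left_shards, right_shards, field):
    """Join per hash value; shards are {hash value: [shard, ...]}, a shard is a list of rows."""
    results = {}
    for h, shards in left_shards.items():
        # Merge all right-table shard files that carry this hash value.
        merged_right = [row for shard in right_shards.get(h, []) for row in shard]
        rows = []
        for shard in shards:
            # Join each left-table shard file against the merged right-table file.
            rows.extend(left_outer_join(shard, merged_right, field))
        results[h] = rows
    return results


# After migration this node holds, for hash value 0, two shard files per table.
left_shards = {0: [[{"id": 4}], [{"id": 8}]]}
right_shards = {0: [[{"id": 8, "v": "x"}], [{"id": 4, "v": "y"}]]}
per_hash = join_segment(left_shards, right_shards, "id")
```

The dictionary `per_hash` is what the slave node would report back to the master node, one join result per hash value.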
With reference to the second aspect, in a first possible implementation of the second aspect, the method further includes:
receiving information, sent by the master node, about the table field used as the basis for hash partitioning and for the first outer join.
With reference to the second aspect and/or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, migrating table shard files under the control of the master node includes:
receiving a first migration command sent by the master node, where the first migration command includes, for each hash value in the hash value segment corresponding to the slave node, the corresponding table shard files and information about the slave nodes on which those files reside;
fetching the table shard files from the slave nodes on which they reside.
With reference to the second aspect and/or the first possible implementation of the second aspect, in a third possible implementation of the second aspect, migrating table shard files under the control of the master node includes:
receiving a second migration command sent by the master node, where the second migration command includes information about the slave node corresponding to each hash value segment;
sending, according to the information about the slave node corresponding to each hash value segment, the local table shard files of the slave node to the slave nodes corresponding to those files, where the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the file's hash value.
In a third aspect, a data query device is provided, including:
a first receiving unit, configured to receive a data query request;
a first determining unit, configured to determine, according to the data query request received by the first receiving unit, two data tables that need to be hash-partitioned;
a first sending unit, configured to send information about the data tables determined by the first determining unit to each slave node;
where the first receiving unit is further configured to receive, from each slave node, a correspondence between hash values and table shard files, the correspondence being obtained by the slave node hash-partitioning the data tables indicated by the information about the data tables;
a segmenting unit, configured to segment the hash values received by the first receiving unit to obtain hash value segments, and to determine the slave node corresponding to each hash value segment;
a control unit, configured to control, according to the correspondence between hash values and table shard files received by the first receiving unit, each slave node to migrate table shard files, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, whereupon each slave node merges the table shard files of one of the two data tables by hash value and, per hash value, performs a first outer join between the merged table shard files and the table shard files of the other data table to obtain a join result for each hash value;
where the first receiving unit is further configured to receive, from each slave node, the join result for each hash value;
a query unit, configured to determine the result of the data query request according to the join results received by the first receiving unit.
With reference to the third aspect, in a first possible implementation of the third aspect, the first determining unit is further configured to determine, according to the data query request received by the first receiving unit, the table field used as the basis for hash partitioning and for the first outer join;
the first sending unit is further configured to send information about the table field determined by the first determining unit to each slave node.
With reference to the third aspect and/or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the query unit is specifically configured to: merge the join results for all hash values to obtain a first outer join result; and determine the result of the data query request according to the first outer join result.
With reference to the third aspect, and/or the first possible implementation of the third aspect, and/or the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the control unit is specifically configured to:
send a first migration command to each slave node, where the first migration command includes, for each hash value in the hash value segment corresponding to the slave node, the corresponding table shard files and information about the slave nodes on which those files reside; the first migration command instructs the slave node to fetch the table shard files from the slave nodes on which they reside.
With reference to the third aspect, and/or the first possible implementation of the third aspect, and/or the second possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the control unit is specifically configured to:
send a second migration command to each slave node, where the second migration command includes information about the slave node corresponding to each hash value segment; the second migration command instructs the slave node to send its local table shard files to the slave nodes corresponding to those files, where the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the file's hash value.
In a fourth aspect, a data query device is provided, including:
a second receiving unit, configured to receive information about two data tables sent by a master node;
a splitting unit, configured to hash-partition each of the two data tables, stored locally on the slave node, that are indicated by the information received by the second receiving unit, to obtain the table shard files corresponding to each hash value;
a second sending unit, configured to send the correspondence between the hash values obtained by the splitting unit and the table shard files to the master node;
a controlled unit, configured to migrate table shard files under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, where the hash value segments are obtained by the master node segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table shard files;
a second merging unit, configured to merge by hash value, after the migration by the controlled unit, the table shard files of one of the two data tables to obtain a merged table shard file for each hash value;
a joining unit, configured to perform, per hash value, a first outer join between the merged table shard file obtained by the second merging unit and the table shard files of the other data table to obtain a join result for each hash value;
where the second sending unit is further configured to send the join result for each hash value obtained by the joining unit to the master node, so that the master node determines the result of the data query request according to the join results.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the second receiving unit is further configured to receive information, sent by the master node, about the table field used as the basis for hash partitioning and for the first outer join.
With reference to the fourth aspect and/or the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the controlled unit is specifically configured to:
receive a first migration command sent by the master node, where the first migration command includes, for each hash value in the hash value segment corresponding to the slave node, the corresponding table shard files and information about the slave nodes on which those files reside;
fetch the table shard files from the slave nodes on which they reside.
With reference to the fourth aspect and/or the first possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, the controlled unit is specifically configured to:
receive a second migration command sent by the master node, where the second migration command includes information about the slave node corresponding to each hash value segment;
send, according to the information about the slave node corresponding to each hash value segment, the local table shard files of the slave node to the slave nodes corresponding to those files, where the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the file's hash value.
In the embodiments, a data query request is received; two data tables that need to be hash-partitioned are determined according to the request; information about the data tables is sent to each slave node; the correspondence between hash values and table shard files is received from each slave node, the correspondence being obtained by the slave node hash-partitioning the data tables indicated by the information; the hash values are segmented to obtain hash value segments, and the slave node corresponding to each segment is determined; according to the correspondence between hash values and table shard files, each slave node is controlled to migrate table shard files so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, whereupon each slave node merges the table shard files of one of the two data tables by hash value and, per hash value, performs a first outer join between the merged table shard files and the table shard files of the other data table to obtain a join result for each hash value; the join result for each hash value is received from each slave node, and the result of the data query request is determined according to the join results. As a result, when an MPP distributed database uses a left or right outer join to process a data query, a correct left or right outer join result is obtained, and a correct query result can then be derived from it.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for the embodiments are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is an example networking structure of an MPP distributed database;
Fig. 2 is a schematic diagram of a first embodiment of the data query method of the present invention;
Fig. 3 is a schematic diagram of a second embodiment of the data query method of the present invention;
Fig. 4 is a schematic diagram of a third embodiment of the data query method of the present invention;
Fig. 5 is a schematic diagram of a first embodiment of the data query device of the present invention;
Fig. 6 is a schematic diagram of a second embodiment of the data query device of the present invention;
Fig. 7 is a schematic structural diagram of a master node of the present invention;
Fig. 8 is a schematic structural diagram of a slave node of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 shows an example networking structure of an MPP distributed database. The client 110 on the client side is connected to the master node 120; the master node 120 is connected to at least one slave node 130 via Ethernet, and the slave nodes 130 are also connected to one another via Ethernet. A node in the MPP distributed database, such as the master node or a slave node, may be a single physical device such as a personal computer (PC), or multiple physical devices, such as a local area network made up of PCs.
Fig. 2 is a schematic diagram of the first embodiment of the data query method of the present invention. The method includes:
Step 201: the master node receives a data query request;
Step 202: the master node determines, according to the data query request, two data tables that need to be hash-partitioned;
Step 203: the master node sends information about the data tables to each slave node;
Step 204: the master node receives, from each slave node, a correspondence between hash values and table shard files, where the correspondence is obtained by the slave node hash-partitioning the data tables indicated by the information about the data tables;
Step 205: the master node segments the hash values to obtain hash value segments, and determines the slave node corresponding to each hash value segment;
Step 206: the master node controls, according to the correspondence between hash values and table shard files, each slave node to migrate table shard files, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that segment, whereupon each slave node merges the table shard files of one of the two data tables by hash value and, per hash value, performs a first outer join between the merged table shard files and the table shard files of the other data table to obtain a join result for each hash value;
Step 207: the master node receives, from each slave node, the join result for each hash value, and determines the result of the data query request according to the join results.
In this embodiment, the master node controls each slave node to migrate table shard files according to the correspondence between hash values and table shard files. Each slave node then merges the table shard files of one of the two data tables by hash value and, per hash value, performs the first outer join between the merged table shard files and the table shard files of the other data table to obtain a join result for each hash value. The master node receives the join result for each hash value from each slave node and determines the result of the data query request according to the join results. As a result, when an MPP distributed database uses a left or right outer join to process a data query, a correct left or right outer join result is obtained, and a correct query result can then be derived from it.
Seeing Fig. 3, for data query method of the present invention second embodiment schematic diagram, the method includes:
Step 301: The slave node receives the information of two data tables sent by the master node;
Step 302: The slave node hash-splits each of the two data tables, local to the slave node, that the information indicates, obtaining the table slice files corresponding to each hash value;
Step 303: The slave node sends the correspondence between hash values and table slice files to the master node;
Step 304: Under the control of the master node, the slave node migrates table slice files so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment. The hash value segments are obtained by the master node by segmenting the hash values; the slave node corresponding to each hash value segment is determined by the master node; and the control is performed by the master node according to the correspondence between hash values and table slice files;
Step 305: The slave node merges, by hash value, the table slice files of one of the two data tables, obtaining the merged table slice file corresponding to each hash value;
Step 306: The slave node performs, by hash value, the first outer join between the merged table slice file and the table slice files of the other of the two data tables, obtaining the join result corresponding to each hash value;
Step 307: The slave node sends the join result corresponding to each hash value to the master node, so that the master node determines the result of the data query request according to the join results.
In this embodiment, the slave node migrates table slice files under the control of the master node, so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment. The hash value segments are obtained by the master node by segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table slice files. The slave node merges, by hash value, the table slice files of one of the two data tables, obtaining the merged table slice file corresponding to each hash value; performs, by hash value, the first outer join between the merged table slice file and the table slice files of the other data table, obtaining the join result corresponding to each hash value; and sends the join result corresponding to each hash value to the master node, so that the master node determines the result of the data query request according to the join results. The slave node thus cooperates with the master node so that, when a left outer join or a right outer join is used for data query processing in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained.
Referring to Fig. 4, a schematic diagram of a third embodiment of the data query method of the present invention, the method includes:
Step 401: The client receives the query keywords entered by the user, generates a query request, and sends it to the master node.
In practice, the client may provide the user with an interface for entering query keywords. The interface lists the table field names, with an input box after each field name. The user enters specific field content in an input box, and the client uses the table field name for which content was entered, together with the field content entered by the user, as the query keywords.
Step 402: The master node determines, according to the data query request, that the first outer join is to be performed, and determines the first data table and the second data table that need to be hash-split and the table field used as the basis for splitting.
The master node parses the data query request sent by the client. From the parsing result it can determine whether the query uses a left outer join or a right outer join. It can also determine the data tables on which the left or right outer join is performed, namely the first data table and the second data table. It can further determine information such as the table field used as the basis for splitting and as the basis for the first outer join.
How a data query request is parsed is a common capability of existing database management systems and is not described here.
Step 403: The master node sends the information of the first data table, the second data table, and the table field to each slave node.
The data of the first data table and of the second data table is generally stored on the slave nodes. The information may therefore include: an identifier of the first data table, such as its table name; an identifier of the second data table, such as its table name; an identifier of the table field, such as the field name; and the field content.
Step 404: Each slave node hash-splits its local first data table according to the table field, obtaining a first correspondence, which is the correspondence between hash values and first table slice files. Each slave node likewise hash-splits its local second data table according to the table field, obtaining a second correspondence, which is the correspondence between hash values and second table slice files.
A first table slice file is a table slice file obtained by hash-splitting the first data table; a second table slice file is a table slice file obtained by hash-splitting the second data table. A table slice file generally contains only part of the data of its data table.
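The hash split of step 404 can be sketched as follows. This is a minimal illustration only: the function name `hash_split`, the toy hash function, the bucket count, and the sample rows are all assumptions and are not taken from the embodiment, which does not prescribe a particular hash function.

```python
from collections import defaultdict


def hash_split(rows, field, num_buckets=4):
    """Split the rows of a local table into slices keyed by hash value.

    The hash is a deterministic toy function (sum of the UTF-8 bytes of the
    field value, modulo num_buckets); a real system would use its own.
    """
    slices = defaultdict(list)
    for row in rows:
        hash_value = sum(str(row[field]).encode("utf-8")) % num_buckets
        slices[hash_value].append(row)
    return dict(slices)


# Hypothetical local portion of the first data table on one slave node.
local_first_table = [
    {"id": 1, "name": "Wu"},
    {"id": 2, "name": "Wang"},
    {"id": 5, "name": "Zhang"},
]

# Maps each hash value to the records of the matching first table slice.
first_correspondence = hash_split(local_first_table, "id")
```

Applying the same function, with the same field, to the local second data table would give the second correspondence that the slave node reports in step 405.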
Step 405: Each slave node sends the first correspondence and the second correspondence to the master node.
Step 406: The master node segments the hash values to obtain hash value segments and determines the slave node corresponding to each hash value segment.
How the master node segments the hash values is not limited here. Different hash value segments may have the same or different lengths.
In a first possible implementation, the slave node corresponding to each hash value segment may be specified according to a preset rule. The preset rule may be random assignment, or assignment to each hash value segment in turn according to a preset order of the slave nodes, and so on. For example:
Suppose the hash values are divided into hash value segment 1, hash value segment 2, hash value segment 3, and so on. With random assignment, slave node 3 may be specified for hash value segment 1, slave node 4 for hash value segment 2, slave node 7 for hash value segment 3, and so on. Alternatively, if the preset order of the slave nodes is slave node 1, slave node 2, ..., slave node n, where n is a natural number, then according to this order slave node 1 may be specified for hash value segment 1, slave node 2 for hash value segment 2, slave node 3 for hash value segment 3, and so on.
In a second possible implementation, the slave node corresponding to each hash value segment may be determined from the first correspondence and second correspondence sent by each slave node: the slave node that holds the most hash values belonging to a hash value segment is taken as the slave node corresponding to that segment. The specific implementation method is not limited here. For example, suppose the hash values are divided into hash value segment 1, hash value segment 2, hash value segment 3, and so on, and that among slave node 1 to slave node n, slave node m (1 < m < n) holds the most hash values belonging to hash value segment 1. Slave node m may then be specified as the slave node corresponding to hash value segment 1.
The above methods of determining the slave node corresponding to each hash value segment are only examples. Other methods may be used in practice, and the present invention does not limit them.
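The two assignment approaches described for step 406 can be sketched as follows. This is an illustrative sketch under stated assumptions: equal-sized contiguous segments, the function names, and the node labels are all hypothetical, and the embodiment leaves both the segmentation and the assignment method open.

```python
import random


def segment_hash_values(hash_values, segment_count):
    """Divide the sorted hash values into contiguous segments (assumed layout)."""
    ordered = sorted(hash_values)
    if not ordered:
        return []
    size = -(-len(ordered) // segment_count)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def assign_in_order(segments, nodes):
    """First implementation: follow a preset order of the slave nodes."""
    return {i: nodes[i % len(nodes)] for i in range(len(segments))}


def assign_randomly(segments, nodes, rng=random):
    """First implementation: pick a slave node at random for each segment."""
    return {i: rng.choice(nodes) for i in range(len(segments))}


def assign_by_most_hash_values(segments, node_hash_values):
    """Second implementation: pick the node holding the most hash values
    of the segment. node_hash_values maps node -> set of reported hash values."""
    assignment = {}
    for i, segment in enumerate(segments):
        wanted = set(segment)
        assignment[i] = max(
            node_hash_values,
            key=lambda node: len(node_hash_values[node] & wanted),
        )
    return assignment


segments = segment_hash_values(range(6), 3)
in_order = assign_in_order(segments, ["slave1", "slave2", "slave3"])
by_count = assign_by_most_hash_values(
    segments, {"slave1": {0, 1, 2}, "slave2": {3, 4, 5}}
)
```

Choosing the node that already holds the most hash values of a segment would tend to reduce how many slice files have to move in step 407, which is one plausible reason to prefer the second approach.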
Step 407: The master node controls each slave node to migrate table slice files according to the first correspondence and second correspondence sent by each slave node. The table slice file migration includes: migrating the first table slice files and second table slice files corresponding to the hash values in a hash value segment to the slave node corresponding to that hash value segment.
In a first possible implementation, the master node controlling each slave node to migrate table slice files may include:
The master node sends a first migration command to each slave node. The first migration command includes: for each hash value in the hash value segment corresponding to the slave node, the first table slice files corresponding to that hash value and the information of the slave node where each file is located; and, for each hash value in that hash value segment, the second table slice files corresponding to that hash value and the information of the slave node where each file is located. The first migration command instructs the slave node to obtain the first table slice files and the second table slice files from the slave nodes where the files are located.
In a second possible implementation, the master node controlling each slave node to migrate table slice files may include:
The master node sends a second migration command to each slave node. The second migration command includes the information of the slave node corresponding to each hash value segment. The second migration command instructs the slave node to send each of its local first table slice files to the slave node corresponding to that first table slice file, and to send each of its local second table slice files to the slave node corresponding to that second table slice file. The slave node corresponding to a first table slice file is the slave node corresponding to the hash value segment that contains the hash value of that first table slice file; the slave node corresponding to a second table slice file is the slave node corresponding to the hash value segment that contains the hash value of that second table slice file.
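A sketch of how the master node might assemble the first (pull-style) migration command from the reported correspondences is given here. The data layout, the function name `build_pull_commands`, the file names, and the choice to omit files that are already on their target node are assumptions made for illustration; the embodiment does not define a command format.

```python
def build_pull_commands(hash_owner, reported):
    """Build one pull command list per target slave node.

    hash_owner: hash value -> slave node that owns the segment containing it.
    reported:   source slave node -> {hash value: [slice file names]}.
    Files already located on their target node are skipped (an assumption).
    """
    commands = {}
    for source, files_by_hash in reported.items():
        for hash_value, files in files_by_hash.items():
            target = hash_owner[hash_value]
            if target == source:
                continue  # nothing to migrate for this file
            for file_name in files:
                commands.setdefault(target, []).append(
                    {"hash_value": hash_value, "file": file_name, "source": source}
                )
    return commands


# Hypothetical ownership and reported correspondences.
hash_owner = {0: "slave1", 1: "slave2"}
reported = {
    "slave1": {0: ["t1_h0_s1"], 1: ["t1_h1_s1"]},
    "slave2": {0: ["t1_h0_s2"]},
}
commands = build_pull_commands(hash_owner, reported)
```

Under the second implementation, the master node would instead distribute only the mapping from hash value segments to slave nodes, and each slave node would push its own local slice files to the owners.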
Step 408: Each slave node merges its local first table slice files that correspond to the same hash value, obtaining the merged first table slice file corresponding to each hash value.
Merging here means merging the data in the table slice files. In general, the data in a table slice file consists of multiple records (also called tuples). For example, if first table slice file A contains two records, Wu and Wang, and first table slice file B contains one record, Zhang, then the merged first table slice file contains three records: Wu, Wang and Zhang. Merging is typically done by appending the data of one table slice file to another, for example by adding all records of first table slice file B to first table slice file A; conversely, all records of first table slice file A may be added to first table slice file B. Usually the data of the first table slice file with fewer records is appended to the first table slice file with more records, which gives better merge performance. The order of the records in the merged first table slice file does not affect the left outer join or right outer join of the embodiments of the present invention.
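The merge described for step 408 can be sketched as follows, using the Wu, Wang and Zhang records from the example. Slice files are modeled as in-memory lists, and the function name `merge_slices` is an assumption for illustration only.

```python
def merge_slices(slices):
    """Merge the records of several slice files that share one hash value.

    The slice with the most records is used as the base and the smaller
    slices are appended to it, mirroring the append-smaller-into-larger
    approach described in the text.
    """
    if not slices:
        return []
    ordered = sorted(slices, key=len, reverse=True)
    merged = list(ordered[0])  # copy so the inputs are left unchanged
    for smaller in ordered[1:]:
        merged.extend(smaller)
    return merged


slice_a = ["Wu", "Wang"]
slice_b = ["Zhang"]
merged = merge_slices([slice_b, slice_a])
```

Because record order does not affect the outer join, the sketch makes no attempt to preserve any particular ordering across slices.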
Step 409: The master node instructs each slave node to perform the first outer join. Each slave node performs, according to the table field, the first outer join between the merged first table slice file and the second table slice files corresponding to each hash value, obtaining the join result corresponding to each hash value.
The master node's instruction to each slave node to perform the first outer join may be issued at any point between step 402 and step 409, and its order relative to steps 403 to 408 is not limited.
The first data table may serve as the left table and the second data table as the right table, in which case the first outer join may be a right outer join. Alternatively, the first data table may serve as the right table and the second data table as the left table, in which case the first outer join may be a left outer join.
On a given slave node, the first outer joins between the merged first table slice file corresponding to a hash value and each second table slice file corresponding to that hash value can be performed in parallel. For example, if hash value A on slave node 1 corresponds to the merged first table slice file X1 and to the second table slice files Y1, Y2 and Y3, then the three first outer joins (X1 with Y1, X1 with Y2, and X1 with Y3) can be performed in parallel;
Further, on the same node, the first outer joins for different hash values can be performed in parallel. For example, if slave node 1 has hash values A, B and C, then the first outer joins between the merged first table slice file corresponding to hash value A and each second table slice file corresponding to hash value A, those for hash value B, and those for hash value C can be performed in parallel;
Further, the first outer joins performed on different nodes can also be performed in parallel. For example, if slave nodes 1, 2 and 3 each need to perform first outer joins, the first outer join processing on these three slave nodes can proceed in parallel.
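The per-hash-value joins of step 409 and their parallel execution can be sketched as follows. The sketch assumes the case in which the first data table is the left table and the first outer join is a right outer join, so every record of each second table slice is kept. The function names, the in-memory row format, and the use of a thread pool are illustrative assumptions, not part of the embodiment.

```python
from concurrent.futures import ThreadPoolExecutor


def outer_join_keeping_slice(merged_first, second_slice, field):
    """Join one second table slice against the merged first table file.

    Every record of the second table slice is kept; a record with no match
    in the merged first table file is paired with None.
    """
    index = {}
    for row in merged_first:
        index.setdefault(row[field], []).append(row)
    result = []
    for right in second_slice:
        matches = index.get(right[field])
        if matches:
            result.extend((left, right) for left in matches)
        else:
            result.append((None, right))
    return result


def join_hash_value(merged_first, second_slices, field):
    """Run the joins X1 with Y1, X1 with Y2, ... in parallel and concatenate."""
    with ThreadPoolExecutor() as pool:
        parts = pool.map(
            lambda s: outer_join_keeping_slice(merged_first, s, field),
            second_slices,
        )
        return [pair for part in parts for pair in part]


# Hypothetical merged first table file X1 and second table slices Y1, Y2.
x1 = [{"id": 1, "name": "Wu"}, {"id": 5, "name": "Zhang"}]
y1 = [{"id": 1, "city": "A"}]
y2 = [{"id": 9, "city": "B"}]
join_result = join_hash_value(x1, [y1, y2], "id")
```

Merging the first table's slices before joining matters here: each preserved record is checked against all first-table records with the same hash value, so it is never reported as unmatched merely because its match sat in a different slice.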
Step 410: Each slave node sends the join result corresponding to each hash value to the master node.
Step 411: The master node merges the join results corresponding to the hash values to obtain the first outer join result, and determines the data query result according to the first outer join result.
How the master node determines the data query result from the first outer join result is not described in detail here. For example, the first outer join result may be searched using the query keywords entered by the user, and the records that contain the query keywords may be taken as the query result.
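Step 411 can be sketched as follows, assuming that each join result arrives as a list of flat records and that a query keyword is a field name paired with field content. Both function names and the record layout are hypothetical, and exact-match filtering is only one possible reading of the keyword search mentioned above.

```python
def merge_join_results(results_by_hash_value):
    """Concatenate the per-hash-value join results into one outer join result."""
    merged = []
    for hash_value in sorted(results_by_hash_value):
        merged.extend(results_by_hash_value[hash_value])
    return merged


def filter_by_keyword(rows, field, content):
    """Keep the records whose field equals the content entered by the user."""
    return [row for row in rows if row.get(field) == content]


# Hypothetical join results received from the slave nodes.
results_by_hash_value = {
    1: [{"name": "Wu", "city": "A"}],
    0: [{"name": None, "city": "B"}],
}
outer_join_result = merge_join_results(results_by_hash_value)
query_result = filter_by_keyword(outer_join_result, "name", "Wu")
```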
Step 412: The master node sends the data query result to the client.
Step 413: The client displays the data query result to the user.
How the client displays the data query result to the user is not limited by the embodiments of the present invention.
In this embodiment, the master node controls each slave node to migrate table slice files according to the first correspondence and the second correspondence. Each slave node then merges its local first table slice files that correspond to the same hash value, obtaining the merged first table slice file corresponding to each hash value, and performs, according to the table field, the first outer join between the merged first table slice file and the second table slice files corresponding to each hash value, obtaining the join result corresponding to each hash value. The master node receives the join result corresponding to each hash value from each slave node and merges these join results to obtain the data query result. Thus, when a left outer join or a right outer join is used for data query processing in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained.
Referring to Fig. 5, a schematic diagram of a first embodiment of the data query device of the present invention, the device may be applied to a master node. The device 500 includes:
A first receiving unit 510, configured to receive a data query request;
A first determining unit 520, configured to determine, according to the data query request received by the first receiving unit 510, the two data tables that need to be hash-split;
A first sending unit 530, configured to send the information of the data tables determined by the first determining unit 520 to each slave node;
The first receiving unit 510 is further configured to receive the correspondence between hash values and table slice files sent by each slave node, the correspondence being obtained by the slave node hash-splitting the data tables indicated by the information of the data tables;
A segmenting unit 540, configured to segment the hash values received by the first receiving unit 510 to obtain hash value segments, and to determine the slave node corresponding to each hash value segment;
A control unit 550, configured to control each slave node to migrate table slice files according to the correspondence between hash values and table slice files received by the first receiving unit 510, so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment, whereupon each slave node merges, by hash value, the table slice files of one of the two data tables and performs the first outer join between the merged table slice files and the table slice files of the other data table, obtaining the join result corresponding to each hash value;
The first receiving unit 510 is further configured to receive the join result corresponding to each hash value sent by each slave node;
A query unit 560, configured to determine the result of the data query request according to the join results received by the first receiving unit 510.
The first determining unit 520 may be further configured to determine, according to the data query request received by the first receiving unit 510, the table field used as the basis for the hash split and for the first outer join;
The first sending unit 530 may be further configured to send the information of the table field determined by the first determining unit 520 to each slave node.
The query unit 560 may be specifically configured to merge the join results corresponding to the hash values to obtain the first outer join result, and to determine the result of the data query request according to the first outer join result.
The control unit 550 may be specifically configured to:
Send a first migration command to each slave node, the first migration command including: for each hash value in the hash value segment corresponding to the slave node, the table slice files corresponding to that hash value and the information of the slave node where each table slice file is located. The first migration command instructs the slave node to obtain the table slice files from the slave nodes where they are located.
Alternatively, the control unit 550 may be specifically configured to:
Send a second migration command to each slave node, the second migration command including the information of the slave node corresponding to each hash value segment. The second migration command instructs the slave node to send each of its local table slice files to the slave node corresponding to that table slice file, which is the slave node corresponding to the hash value segment that contains the hash value of that table slice file.
In this embodiment, each slave node is controlled to migrate table slice files according to the correspondence between hash values and table slice files. Each slave node then merges, by hash value, the table slice files of one of the two data tables, and performs the first outer join between the merged table slice files and the table slice files of the other data table, obtaining the join result corresponding to each hash value. The master node receives the join result corresponding to each hash value from each slave node and determines the result of the data query request according to these join results. Thus, when a left outer join or a right outer join is used for data query processing in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained.
Referring to Fig. 6, a schematic diagram of a second embodiment of the data query device of the present invention, the device may be arranged in a slave node. The device 600 includes:
A second receiving unit 610, configured to receive the information of two data tables sent by the master node;
A splitting unit 620, configured to hash-split each of the two data tables, local to the slave node, that are indicated by the information received by the second receiving unit 610, obtaining the table slice files corresponding to each hash value;
A second sending unit 630, configured to send the correspondence between hash values and table slice files obtained by the splitting unit 620 to the master node;
A controlled unit 640, configured to migrate table slice files under the control of the master node, so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment. The hash value segments are obtained by the master node by segmenting the hash values; the slave node corresponding to each hash value segment is determined by the master node; and the control is performed by the master node according to the correspondence between hash values and table slice files;
A second merging unit 650, configured to merge, by hash value, the table slice files of one of the two data tables after the migration by the controlled unit 640, obtaining the merged table slice file corresponding to each hash value;
A joining unit 660, configured to perform, by hash value, the first outer join between the merged table slice file obtained by the second merging unit 650 and the table slice files of the other of the two data tables, obtaining the join result corresponding to each hash value;
The second sending unit 630 is further configured to send the join result corresponding to each hash value obtained by the joining unit 660 to the master node, so that the master node determines the result of the data query request according to the join results.
The second receiving unit 610 may be further configured to receive the information, sent by the master node, of the table field used as the basis for the hash split and for the first outer join.
The controlled unit 640 may be specifically configured to:
Receive the first migration command sent by the master node, the first migration command including: for each hash value in the hash value segment corresponding to the slave node, the table slice files corresponding to that hash value and the information of the slave node where each table slice file is located;
Obtain the table slice files from the slave nodes where they are located.
Alternatively, the controlled unit 640 may be specifically configured to:
Receive the second migration command sent by the master node, the second migration command including the information of the slave node corresponding to each hash value segment;
Send, according to the information of the slave node corresponding to each hash value segment, each local table slice file of the slave node to the slave node corresponding to that table slice file, which is the slave node corresponding to the hash value segment that contains the hash value of that table slice file.
In this embodiment, table slice files are migrated under the control of the master node, so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment. The hash value segments are obtained by the master node by segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table slice files. The table slice files of one of the two data tables are merged by hash value, obtaining the merged table slice file corresponding to each hash value; the first outer join is performed, by hash value, between the merged table slice file and the table slice files of the other data table, obtaining the join result corresponding to each hash value; and the join result corresponding to each hash value is sent to the master node, so that the master node determines the result of the data query request according to the join results. In cooperation with the master node, this ensures that when a left outer join or a right outer join is used for data query processing in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained.
Referring to Fig. 7, a structural diagram of a master node according to an embodiment of the present invention, the master node 700 includes: a processor 710, a memory 720, a transceiver 730, and a bus 740;
The processor 710, the memory 720, and the transceiver 730 are connected to one another through the bus 740. The bus 740 may be an ISA bus, a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in Fig. 7, but this does not mean that there is only one bus or only one type of bus.
The memory 720 is configured to store a program. Specifically, the program may include program code, and the program code includes computer operating instructions. The memory 720 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The transceiver 730 is configured to connect to and communicate with other devices. The transceiver 730 is configured to: receive a data query request; send the information of the data tables to each slave node; receive the correspondence between hash values and table slice files sent by each slave node, the correspondence being obtained by the slave node hash-splitting the data tables indicated by the information of the data tables; and receive the join result corresponding to each hash value sent by each slave node;
The processor 710 executes the program code and is configured to: determine, according to the data query request received by the transceiver 730, the two data tables that need to be hash-split; segment the hash values received by the transceiver 730 to obtain hash value segments, and determine the slave node corresponding to each hash value segment; control each slave node to migrate table slice files according to the correspondence between hash values and table slice files, so that the table slice files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment, whereupon each slave node merges, by hash value, the table slice files of one of the two data tables and performs the first outer join between the merged table slice files and the table slice files of the other data table, obtaining the join result corresponding to each hash value; and determine the result of the data query request according to the join results.
The processor 710 may be further configured to determine, according to the data query request received by the transceiver 730, the table field used as the basis for the hash split and for the first outer join;
The transceiver 730 may be further configured to send the information of the table field determined by the processor 710 to each slave node.
The processor 710 may be specifically configured to merge the join results corresponding to the hash values to obtain the first outer join result, and to determine the result of the data query request according to the first outer join result.
The processor 710 may be specifically configured to control the transceiver 730 to send a first migration command to each slave node, the first migration command including: for each hash value in the hash value segment corresponding to the slave node, the table slice files corresponding to that hash value and the information of the slave node where each table slice file is located. The first migration command instructs the slave node to obtain the table slice files from the slave nodes where they are located;
The transceiver 730 may be further configured to send the first migration command to each slave node.
Alternatively, the processor 710 may be specifically configured to control the transceiver 730 to send a second migration command to each slave node, the second migration command including the information of the slave node corresponding to each hash value segment. The second migration command instructs the slave node to send each of its local table slice files to the slave node corresponding to that table slice file, which is the slave node corresponding to the hash value segment that contains the hash value of that table slice file;
The transceiver 730 may be further configured to send the second migration command to each slave node.
In this embodiment, the master node controls each slave node to migrate table slice files according to the correspondence between hash values and table slice files. Each slave node then merges, by hash value, the table slice files of one of the two data tables, and performs the first outer join between the merged table slice files and the table slice files of the other data table, obtaining the join result corresponding to each hash value. The master node receives the join result corresponding to each hash value from each slave node and determines the result of the data query request according to these join results. Thus, when a left outer join or a right outer join is used for data query processing in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained from it.
Referring to Fig. 8, which is a structural diagram of a slave node according to an embodiment of the present invention, the slave node 800 includes: a processor 810, a memory 820, a transceiver 830, and a bus 840;
The processor 810, the memory 820, and the transceiver 830 are connected to one another by the bus 840; the bus 840 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 8, but this does not mean that there is only one bus or only one type of bus.
The memory 820 is configured to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory 820 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
The transceiver 830 is configured to connect to and communicate with other devices. The transceiver 830 is configured to: receive the information about two data tables sent by the master node; send the correspondence between hash values and table shard files to the master node; and send the join results corresponding to the hash values to the master node, so that the master node determines the result of the data query request from the join results.
The processor 810 executes the program code to: perform hash splitting separately on the two data tables, local to the slave node, that are indicated by the information, to obtain the table shard files corresponding to each hash value; migrate table shard files through the transceiver 830 under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment, where the hash value segments are obtained by the master node by segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table shard files; merge the table shard files of one of the two data tables by hash value, to obtain the merged table shard file corresponding to each hash value; and perform, by hash value, the first outer join between the merged table shard file and the table shard files of the other data table, to obtain the join result corresponding to each hash value.
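The hash splitting step performed by the slave node can be illustrated with a minimal Python sketch. The choice of CRC32 as the hash function, the bucket count, the function name `hash_split`, and the sample table are all assumptions made for illustration; the patent text here does not name a hash function. Rows are kept in memory rather than written to files to keep the sketch short.

```python
import zlib


def hash_split(rows, join_field, bucket_count):
    """Split a local table into shards keyed by the hash of the join field.

    rows:         list of dicts standing in for the rows of a local data table.
    join_field:   the table field used both for splitting and for the join.
    bucket_count: number of distinct hash values (an assumed parameter).
    Returns {hash_value: [row, ...]}; each list stands in for one shard file.
    """
    shards = {}
    for row in rows:
        key_bytes = str(row[join_field]).encode("utf-8")
        # CRC32 is deterministic across processes, unlike Python's built-in
        # hash() for strings, so every node computes the same hash value.
        hash_value = zlib.crc32(key_bytes) % bucket_count
        shards.setdefault(hash_value, []).append(row)
    return shards


orders = [{"cust": "a", "amt": 1}, {"cust": "b", "amt": 2}, {"cust": "a", "amt": 3}]
order_shards = hash_split(orders, "cust", 4)
# The correspondence reported to the master node: hash value -> shard file name.
correspondence = {h: f"orders_h{h}.shard" for h in order_shards}
```

Using the same field and the same hash function on both tables is what guarantees that rows able to match in the join end up under the same hash value.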
The transceiver 830 may be further configured to: receive, from the master node, information about the table field that serves as the basis for both the hash splitting and the first outer join.
The processor 810 is configured to: according to the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and the information about the slave nodes on which those table shard files reside, both included in the first migration command received by the transceiver 830, obtain the table shard files through the transceiver 830 from the slave nodes on which they reside;
The transceiver 830 may specifically be configured to: receive the first migration command sent by the master node, the first migration command including: the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and information about the slave nodes on which those table shard files reside; and obtain the table shard files from the slave nodes on which they reside.
Specifically, the processor 810 may be configured to: according to the information, included in the second migration command received by the transceiver 830, about the slave node corresponding to each hash value segment, send the slave node's local table shard files through the transceiver 830 to the slave nodes corresponding to those table shard files;
The transceiver 830 may specifically be configured to: receive the second migration command sent by the master node, the second migration command including: information about the slave node corresponding to each hash value segment; and send the slave node's local table shard files to the slave nodes corresponding to those table shard files; the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the hash value of that table shard file.
In this embodiment, the slave node migrates table shard files under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment; the hash value segments are obtained by the master node by segmenting the hash values, the slave node corresponding to each hash value segment is determined by the master node, and the control is performed by the master node according to the correspondence between hash values and table shard files. The slave node merges the table shard files of one of the two data tables by hash value to obtain the merged table shard file corresponding to each hash value; performs, by hash value, the first outer join between the merged table shard file and the table shard files of the other data table to obtain the join result corresponding to each hash value; and sends the join results corresponding to the hash values to the master node, so that the master node determines the result of the data query request from the join results. Thus, in cooperation with the master node, when a left outer join or right outer join is used to process a data query in an MPP distributed database, a correct left or right outer join result can be obtained, and a correct data query result can in turn be obtained from it.
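The merge-then-join step on a slave node can be sketched as follows, taking a left outer join as the example. The sketch assumes that the merged table is the right-hand side of the join, so that every left row is compared against all right rows sharing its hash value; that choice, the function names, and the in-memory row lists standing in for shard files are assumptions of this sketch rather than statements from the patent.

```python
def merge_by_hash(shard_files):
    """Merge the shard files of one table so there is one row list per hash value.

    shard_files: iterable of (hash_value, rows) pairs gathered after migration.
    """
    merged = {}
    for hash_value, rows in shard_files:
        merged.setdefault(hash_value, []).extend(rows)
    return merged


def left_outer_join_by_hash(left_shards, right_merged, key):
    """Left outer join each left shard against the merged right rows.

    Returns {hash_value: [(left_row, right_row_or_None), ...]}.
    """
    results = {}
    for hash_value, left_rows in left_shards:
        # Index the merged right rows of this hash value by join key.
        index = {}
        for right_row in right_merged.get(hash_value, []):
            index.setdefault(right_row[key], []).append(right_row)
        out = results.setdefault(hash_value, [])
        for left_row in left_rows:
            matches = index.get(left_row[key])
            if matches:
                out.extend((left_row, right_row) for right_row in matches)
            else:
                out.append((left_row, None))  # unmatched left row is kept once
    return results


left = [(0, [{"id": 1, "n": "a"}]), (0, [{"id": 2, "n": "b"}])]
right = merge_by_hash([(0, [{"id": 1, "v": 10}]), (0, [{"id": 1, "v": 11}])])
join_result = left_outer_join_by_hash(left, right, "id")
```

Merging one side first means that an unmatched left row is emitted exactly once with a null partner, instead of once per right shard file it fails to match.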
Those skilled in the art will clearly understand that the techniques in the embodiments of the present invention may be implemented by software together with the necessary general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the description of the method embodiments.
The embodiments of the present invention described above do not limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (18)

1. A data query method, characterized by comprising:
receiving a data query request;
determining, according to the data query request, two data tables on which hash splitting needs to be performed;
sending information about the data tables to each slave node;
receiving the correspondence between hash values and table shard files sent by each slave node, the correspondence between hash values and table shard files being obtained by the slave node performing the hash splitting on the data tables indicated by the information about the data tables;
segmenting the hash values to obtain hash value segments, and determining the slave node corresponding to each hash value segment;
controlling, according to the correspondence between hash values and table shard files, each slave node to migrate table shard files, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment, in order that each slave node merges the table shard files of one of the two data tables by hash value and performs, by hash value, a first outer join between the merged table shard files and the table shard files of the other of the two data tables, to obtain the join result corresponding to each hash value;
receiving the join result corresponding to each hash value sent by each slave node, and determining the result of the data query request according to the join results;
wherein the two data tables include the two tables that, during the data query, are the processing objects of a left outer join and a right outer join.
2. The method according to claim 1, characterized by further comprising:
determining, according to the data query request, the table field that serves as the basis for the hash splitting and for the first outer join;
sending information about the table field to each slave node.
3. The method according to claim 1 or 2, characterized in that determining the result of the data query request according to the join results comprises:
merging the join results corresponding to the hash values to obtain a first outer join result;
determining the result of the data query request according to the first outer join result.
4. The method according to claim 1, characterized in that controlling each slave node to migrate table shard files comprises:
sending a first migration command to each slave node, the first migration command including: the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and information about the slave nodes on which those table shard files reside; the first migration command instructs the slave node to obtain the table shard files from the slave nodes on which they reside.
5. The method according to claim 1, characterized in that controlling each slave node to migrate table shard files comprises:
sending a second migration command to each slave node, the second migration command including: information about the slave node corresponding to each hash value segment; the second migration command instructs the slave node to send its local table shard files to the slave nodes corresponding to those table shard files; the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the hash value of that table shard file.
6. A data query method, characterized by comprising:
receiving information about two data tables sent by a master node;
performing hash splitting separately on the two data tables, local to the slave node, that are indicated by the information, to obtain the table shard files corresponding to each hash value;
sending the correspondence between hash values and table shard files to the master node;
migrating table shard files under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment; the hash value segments are obtained by the master node by segmenting the hash values, and the slave node corresponding to each hash value segment is determined by the master node; the control is performed by the master node according to the correspondence between hash values and table shard files;
merging the table shard files of one of the two data tables by hash value, to obtain the merged table shard file corresponding to each hash value;
performing, by hash value, a first outer join between the merged table shard file and the table shard files of the other of the two data tables, to obtain the join result corresponding to each hash value;
sending the join results corresponding to the hash values to the master node, so that the master node determines the result of the data query request according to the join results;
wherein the two data tables include the two tables that, during the data query, are the processing objects of a left outer join and a right outer join.
7. The method according to claim 6, characterized by further comprising:
receiving, from the master node, information about the table field that serves as the basis for the hash splitting and for the first outer join.
8. The method according to claim 6 or 7, characterized in that migrating table shard files under the control of the master node comprises:
receiving a first migration command sent by the master node, the first migration command including: the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and information about the slave nodes on which those table shard files reside;
obtaining the table shard files from the slave nodes on which they reside.
9. The method according to claim 6 or 7, characterized in that migrating table shard files under the control of the master node comprises:
receiving a second migration command sent by the master node, the second migration command including: information about the slave node corresponding to each hash value segment;
sending, according to the information about the slave node corresponding to each hash value segment, the slave node's local table shard files to the slave nodes corresponding to those table shard files; the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the hash value of that table shard file.
10. A data query device, characterized by comprising:
a first receiving unit, configured to receive a data query request;
a first determining unit, configured to determine, according to the data query request received by the first receiving unit, two data tables on which hash splitting needs to be performed;
a first sending unit, configured to send information about the data tables determined by the first determining unit to each slave node;
the first receiving unit being further configured to: receive the correspondence between hash values and table shard files sent by each slave node, the correspondence between hash values and table shard files being obtained by the slave node performing hash splitting on the data tables indicated by the information about the data tables;
a segmenting unit, configured to segment the hash values received by the first receiving unit to obtain hash value segments, and to determine the slave node corresponding to each hash value segment;
a control unit, configured to control, according to the correspondence between hash values and table shard files received by the first receiving unit, each slave node to migrate table shard files, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment, in order that each slave node merges the table shard files of one of the two data tables by hash value and performs, by hash value, a first outer join between the merged table shard files and the table shard files of the other of the two data tables, to obtain the join result corresponding to each hash value;
the first receiving unit being further configured to: receive the join result corresponding to each hash value sent by each slave node;
a query unit, configured to determine the result of the data query request according to the join results received by the first receiving unit;
wherein the two data tables include the two tables that, during the data query, are the processing objects of a left outer join and a right outer join.
11. The device according to claim 10, characterized in that the first determining unit is further configured to: determine, according to the data query request received by the first receiving unit, the table field that serves as the basis for the hash splitting and for the first outer join;
the first sending unit is further configured to: send information about the table field determined by the first determining unit to each slave node.
12. The device according to claim 10, characterized in that the query unit is specifically configured to: merge the join results corresponding to the hash values to obtain a first outer join result; and determine the result of the data query request according to the first outer join result.
13. The device according to any one of claims 10 to 12, characterized in that the control unit is specifically configured to:
send a first migration command to each slave node, the first migration command including: the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and information about the slave nodes on which those table shard files reside; the first migration command instructs the slave node to obtain the table shard files from the slave nodes on which they reside.
14. The device according to any one of claims 10 to 12, characterized in that the control unit is specifically configured to:
send a second migration command to each slave node, the second migration command including: information about the slave node corresponding to each hash value segment; the second migration command instructs the slave node to send its local table shard files to the slave nodes corresponding to those table shard files; the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the hash value of that table shard file.
15. A data query device, characterized by comprising:
a second receiving unit, configured to receive information about two data tables sent by a master node;
a splitting unit, configured to perform hash splitting separately on the two data tables, local to the slave node, that are indicated by the information received by the second receiving unit, to obtain the table shard files corresponding to each hash value;
a second sending unit, configured to send the correspondence between the hash values and the table shard files obtained by the splitting unit to the master node;
a controlled unit, configured to migrate table shard files under the control of the master node, so that the table shard files corresponding to each hash value segment are migrated to the slave node corresponding to that hash value segment; the hash value segments are obtained by the master node by segmenting the hash values, and the slave node corresponding to each hash value segment is determined by the master node; the control is performed by the master node according to the correspondence between hash values and table shard files;
a second merging unit, configured to merge, by hash value, the table shard files of one of the two data tables after the migration by the controlled unit, to obtain the merged table shard file corresponding to each hash value;
a join unit, configured to perform, by hash value, a first outer join between the merged table shard file obtained by the second merging unit and the table shard files of the other of the two data tables, to obtain the join result corresponding to each hash value;
the second sending unit being further configured to: send the join results corresponding to the hash values obtained by the join unit to the master node, so that the master node determines the result of the data query request according to the join results;
wherein the two data tables include the two tables that, during the data query, are the processing objects of a left outer join and a right outer join.
16. The device according to claim 15, characterized in that the second receiving unit is further configured to: receive, from the master node, information about the table field that serves as the basis for the hash splitting and for the first outer join.
17. The device according to claim 15 or 16, characterized in that the controlled unit is specifically configured to:
receive a first migration command sent by the master node, the first migration command including: the table shard files corresponding to each hash value in the hash value segment corresponding to the slave node, and information about the slave nodes on which those table shard files reside;
obtain the table shard files from the slave nodes on which they reside.
18. The device according to claim 15 or 16, characterized in that the controlled unit is specifically configured to:
receive a second migration command sent by the master node, the second migration command including: information about the slave node corresponding to each hash value segment;
send, according to the information about the slave node corresponding to each hash value segment, the slave node's local table shard files to the slave nodes corresponding to those table shard files; the slave node corresponding to a table shard file is the slave node corresponding to the hash value segment that contains the hash value of that table shard file.
CN201310459279.5A 2013-09-27 A kind of data query method and device Active CN103488778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310459279.5A CN103488778B (en) 2013-09-27 A kind of data query method and device

Publications (2)

Publication Number Publication Date
CN103488778A CN103488778A (en) 2014-01-01
CN103488778B true CN103488778B (en) 2016-11-30

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6567806B1 (en) * 1993-01-20 2003-05-20 Hitachi, Ltd. System and method for implementing hash-based load-balancing query processing in a multiprocessor database system
CN1514976A (en) * 1998-07-24 2004-07-21 Distributed computer data base system and method for object searching
CN101719155A (en) * 2009-12-29 2010-06-02 北京航空航天大学 Method of multidimensional attribute range inquiry for supporting distributed multi-cluster computing environment
CN102467570A (en) * 2010-11-17 2012-05-23 日电(中国)有限公司 Connection query system and method for distributed data warehouse
CN102831120A (en) * 2011-06-15 2012-12-19 腾讯科技(深圳)有限公司 Data processing method and system
CN103246659A (en) * 2012-02-06 2013-08-14 阿里巴巴集团控股有限公司 Method and device for key value data query

Similar Documents

Publication Publication Date Title
CN103620601B (en) Joining tables in a mapreduce procedure
US8224825B2 (en) Graph-processing techniques for a MapReduce engine
US7680848B2 (en) Reliable and scalable multi-tenant asynchronous processing
KR20210008142A (en) Technologies for file sharing
US20140358977A1 (en) Management of Intermediate Data Spills during the Shuffle Phase of a Map-Reduce Job
US9483515B2 (en) Managing a table of a database
CN104881466B (en) The processing of data fragmentation and the delet method of garbage files and device
MX2012003721A (en) Systems and methods for social graph data analytics to determine connectivity within a community.
CN102779183B (en) Data inquiry method, equipment and system
CN105677904B (en) Small documents storage method and device based on distributed file system
CN107333248B (en) A kind of real-time sending method of short message and system
CN108509437A (en) A kind of ElasticSearch inquiries accelerated method
US20160274874A1 (en) Method and apparatus for processing request
US9607043B2 (en) Peak data federation multizone splitting
CN108140176A (en) Search result is concurrently identified from the local search and long-range search to communication
WO2013163580A1 (en) Information providing method and system
US11216500B1 (en) Provisioning mailbox views
CN102170466A (en) Data processing method and system
CN106940712A (en) Sequence generating method and equipment
US10437822B2 (en) Grouping tables in a distributed database
CN110909072B (en) Data table establishment method, device and equipment
CN103488778B (en) A kind of data query method and device
CN110019456A (en) Data lead-in method, device and system
US10185729B2 (en) Index creation method and system
US11170000B2 (en) Parallel map and reduce on hash chains

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant