CN106874322A - Data table join method and device - Google Patents


Info

Publication number
CN106874322A
Authority
CN
China
Prior art keywords
data
tables
associated key
sublist
point table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610480216.1A
Other languages
Chinese (zh)
Inventor
康树鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610480216.1A
Publication of CN106874322A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

The present invention provides a data table join method and device. The method is applied to joining a first data table with a second data table, where the first data table includes skewed data that can cause data skew and non-skewed data other than the skewed data. The method includes: extracting the skewed data from the first data table into a first data sub-table, and putting the non-skewed data into a second data sub-table; extracting from the second data table the data that matches the first data sub-table, and putting it into a third data sub-table; performing a mapjoin of the first data sub-table and the third data sub-table to obtain a first join table, and joining the second data sub-table with the second data table to obtain a second join table; and combining the first join table and the second join table to obtain a join result table, which is the result of joining the first data table with the second data table. The present invention improves the efficiency of data table joins.

Description

Data table join method and device
Technical field
The present invention relates to data processing technology, and in particular to a data table join method and device.
Background
When a data warehouse performs data cleansing, one common approach is to associate data tables with one another; this association between data tables in a data warehouse may be called a join operation. The data tables participating in a join usually share the same join key (the linking field used when data tables are associated). Calling the join key "key", suppose for example that one data table stores the correspondence between key and information A, and another data table stores the correspondence between key and information B. When the two are joined, information A and information B corresponding to the same key can be combined by key into a new data table, which can contain the key together with the corresponding information A and information B.
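The join described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: tables are modelled as lists of (key, value) pairs, and the function name `join_on_key` and the sample values are hypothetical, not taken from the patent.

```python
from collections import defaultdict

def join_on_key(left, right):
    """Inner join of two lists of (key, value) pairs on the key."""
    # Index the right-hand table by key so each left row finds its matches quickly.
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    # Emit (key, information A, information B) for every pair of rows sharing a key.
    return [(key, a, b) for key, a in left for b in right_by_key.get(key, [])]

# Hypothetical sample data: one table holds key -> A, the other key -> B.
table_a = [("k1", "A1"), ("k2", "A2")]
table_b = [("k1", "B1"), ("k3", "B3")]
joined = join_on_key(table_a, table_b)
```

Only "k1" appears in both tables, so the new table holds a single row combining its information A and information B.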
A situation that often arises during a join is data skew: among the data tables participating in the join, one data table contains a large number of data records with the same key value. For example, a user login information table may hold one million or ten million records of the user with user ID "123" logging in at different times (one such record being "user ID 123, login time 2016.3.21"). When the distributed computing platform used in the data warehouse then processes a join between this data table and other data tables, the computation usually takes a long time.
Summary of the invention
In view of this, the present invention provides a data table join method and device that improve the efficiency of data table joins when the data tables being joined exhibit data skew.
Specifically, the present invention is achieved through the following technical solutions:
In a first aspect, a data table join method is provided. The method is applied to joining a first data table with a second data table, where the first data table includes skewed data that can cause data skew and non-skewed data other than the skewed data. The method includes:
extracting the skewed data from the first data table into a first data sub-table, and putting the non-skewed data into a second data sub-table;
extracting from the second data table the data that matches the first data sub-table, and putting it into a third data sub-table;
performing a mapjoin of the first data sub-table and the third data sub-table to obtain a first join table, and joining the second data sub-table with the second data table to obtain a second join table;
combining the first join table and the second join table to obtain a join result table, the join result table being the result of joining the first data table with the second data table.
In a second aspect, a data table join device is provided. The device is applied to joining a first data table with a second data table, where the first data table includes skewed data that can cause data skew and non-skewed data other than the skewed data. The device includes:
a table splitting unit, configured to extract the skewed data from the first data table into a first data sub-table, and to put the non-skewed data into a second data sub-table;
a table extraction unit, configured to extract from the second data table the data that matches the first data sub-table, and to put it into a third data sub-table;
a table join unit, configured to perform a mapjoin of the first data sub-table and the third data sub-table to obtain a first join table, and to join the second data sub-table with the second data table to obtain a second join table;
a table combination unit, configured to combine the first join table and the second join table to obtain a join result table, the join result table being the result of joining the first data table with the second data table.
With the data table join method and device of the embodiments of the present invention, the data table containing skewed data is split, the skewed data is mapjoined with a small table, and the remaining data is joined with the other table, so that neither part of the join is affected by the skewed data, which improves the efficiency of data table joins.
Brief description of the drawings
Fig. 1 is a flowchart of the data table join method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the principle of the data table join method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of the data table join device provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of the data table join device provided by an embodiment of the present invention;
Fig. 5 is a hardware structure diagram of a processing device on which the data table join device provided by an embodiment of the present invention resides.
Detailed description
A data warehouse mainly provides data for decision analysis, and the data operations involved are mainly data queries. To ensure the accuracy of the data the warehouse provides, data entering the warehouse generally undergoes data cleansing. Joining data tables is a common technique in data cleansing. For example, when data is processed on a map/reduce distributed computing platform, the reduce stage can join two or more data tables according to their join key (an operation also referred to as a Cartesian product). Suppose the data warehouse receives a data query request asking for information A and information B corresponding to some key, and information A and information B are located in two separate data tables; the two tables can then be joined on the key to obtain a new data table containing the key and the corresponding information A and information B, which is returned to the requester.
For example, a reduce node can obtain from the two data tables the lists of values that share the same key (a table may hold key-to-value correspondences, for instance with the user ID as key and that user's login time as value), and join the data of the two tables for each such key. Under data skew, some keys have far more records than others (sometimes a hundred or a thousand times as many), so the reduce node handling such a key processes much more data than the other nodes. As a result, most reduce nodes finish, but one or a few run very slowly and take a long time to complete, which prolongs the processing time of the whole join.
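The imbalance described above can be illustrated with a small simulation. This is a sketch under stated assumptions: keys are assigned to reduce nodes by `key % num_reducers` (a simplified stand-in for whatever partitioner a real platform uses), the record counts are scaled down for illustration, and `reducer_loads` is a hypothetical helper.

```python
from collections import Counter

def reducer_loads(keys, num_reducers):
    """Count how many records each reduce node receives.

    Assumption: a record with integer key k goes to node k % num_reducers,
    so every record with the same key lands on the same node.
    """
    loads = Counter()
    for key in keys:
        loads[key % num_reducers] += 1
    return [loads[node] for node in range(num_reducers)]

# Scaled-down illustration: key 123 is heavily repeated, the others are not.
keys = [123] * 1000 + [234] * 10 + [345] * 8
loads = reducer_loads(keys, 4)
```

Here one node receives 1000 records while the others receive 10 or fewer, so the whole job waits for that single node.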
The data table join method of the embodiments of the present application aims to improve join efficiency when the data tables being joined exhibit data skew, reducing the impact of data skew on join processing time. Fig. 1 illustrates the flow of the method, which can be executed by a distributed computing platform. The example below describes a join between a first data table and a second data table (in practice the method can also be applied to joins in other scenarios and is not limited to this example); the principle diagram in Fig. 2 can also be consulted:
For example, the first data table can be a user login information table. Table 1 below shows part of the first data table, which holds the correspondence between a user ID and that user's login time. The user ID serves as the join key, on which data tables can be joined.
Table 1: First data table
User ID | Login time
123 | 2016.3.21
... | ...
... | ...
123 | 2016.3.24
234 | 2016.3.26
345 | 2016.3.27
In the first data table, the user ID "123" has one million or ten million data records. In this example, the data for user ID "123" is assumed to be "skewed data that can cause data skew", while the remaining data, such as the records for user IDs "234" and "345", is non-skewed data that does not cause data skew. The second data table, which is joined with the first data table, can be a user name information table such as Table 2 below, containing user IDs and user names.
Table 2: Second data table
User ID | User name
123 | Zhang San
234 | Li Si
345 | Wang Wu
Joining the first data table with the second data table means using the user ID as the join key to find, in the two tables, the login time and user name corresponding to each user ID, and generating a join result table like Table 3, which contains the user ID together with its corresponding login time and user name.
Table 3: Join result table
User ID | Login time | User name
123 | 2016.3.21 | Zhang San
... | ... | ...
... | ... | ...
123 | 2016.3.24 | Zhang San
234 | 2016.3.26 | Li Si
345 | 2016.3.27 | Wang Wu
The process of the data table join method of the present application is described below with reference to the above example:
In step 101, the skewed data is extracted from the first data table into a first data sub-table, and the non-skewed data is put into a second data sub-table.
In this step, the first data table is split into two parts, referred to as the first data sub-table and the second data sub-table. The first data sub-table can contain the skewed data, such as the one million or ten million records for user ID "123" in Table 1; the second data sub-table can contain the non-skewed data, such as the records in Table 1 for user IDs other than "123".
One way of splitting the first data table is as follows:
First, at least one join key that causes data skew is extracted from the first data table and put into a join-key sub-table.
For example, the number of occurrences of each join key in the first data table can be counted, and the join keys sorted by count in descending order. As an example, user ID "123" may occur 1,000,000 times in the first data table, user ID "234" 100,000 times, and user ID "345" 8,000 times. Sorted by count in descending order, the user IDs are "123", "234", "345".
Suppose the preset join-key upper limit is 1; that is, the top-ranked join key, user ID "123", is selected from the above ordering as the join key causing data skew. As another example, if the first data table contains ten distinct join keys, counting and sorting them yields an ordering from first to tenth place; a preset join-key upper limit of 5 then means that the top five join keys in the ordering are selected as the join keys that can cause data skew. This description selects a single join key; Table 4 below is the join-key sub-table, which holds the one join key causing data skew.
Table 4: Join-key sub-table
User ID | Count
123 | 1000000
The preset join-key upper limit can be determined from experience or from testing. For example, when data cleansing runs into a processing timeout caused by data skew, one can check how many times the join key responsible occurs; if it occurs 1,000,000 times, that indicates 1,000,000 records are likely to cause processing delays. An initial join-key upper limit, say 5, can then be set against the sorted join keys, selecting the top five. If the fifth-ranked join key turns out to have a count of only 8,000, the upper limit is unsuitable. If the upper limit is changed to 2 and the second-ranked join key has a count of 1,000,000, the upper limit is reasonable and selects the data records that cause data skew. This is only one example; the join-key upper limit can be determined in other ways, as long as the skewed data can be identified.
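The counting, sorting, and upper-limit selection described above might look like the following sketch. The record counts are scaled down, and the names `skewed_keys`, `limit_is_reasonable`, and the `min_count` threshold are assumptions introduced for illustration, not terms from the patent.

```python
from collections import Counter

def skewed_keys(rows, key_limit):
    """Return the key_limit most frequent join keys with their counts,
    sorted by count in descending order (the join-key sub-table)."""
    counts = Counter(key for key, _value in rows)
    return counts.most_common(key_limit)

def limit_is_reasonable(key_sub_table, min_count):
    """Hypothetical check: every selected key must occur at least min_count
    times, otherwise the upper limit also selects non-skewed keys."""
    return all(count >= min_count for _key, count in key_sub_table)

# Scaled-down first data table: (user ID, login time) records.
rows = [(123, "t")] * 1000 + [(234, "t")] * 100 + [(345, "t")] * 8

key_sub_table = skewed_keys(rows, key_limit=1)
too_wide = skewed_keys(rows, key_limit=3)
```

With a limit of 1 only user ID 123 is selected; with a limit of 3 the rarely occurring ID 345 is selected as well, which the threshold check flags as an unsuitable limit.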
Next, according to the join-key sub-table, the data in the first data table that matches the join-key sub-table is put into the first data sub-table, and the data that does not match is put into the second data sub-table.
For example, once the join-key sub-table has been obtained, it can be joined with the first data table, for instance by mapjoin. Mapjoin is a kind of join in which the data of a small table is read directly into memory and joined with the other table, which can greatly improve the efficiency of producing the join result. The join-key sub-table in this example (Table 4) is a small table, so mapjoin can be used. When the join-key sub-table is joined with the first data table, the data that matches Table 4 is put into the first data sub-table; "matches" here means finding the records in the first data table that correspond to a key in the join-key sub-table. In this example the first data sub-table contains the records for user ID "123" (Table 5). The data that does not match Table 4, namely the records for user IDs other than "123", is put into the second data sub-table (Table 6).
Table 5: First data sub-table
User ID | Login time
123 | 2016.3.21
... | ...
... | ...
123 | 2016.3.24
Table 6: Second data sub-table
User ID | Login time
234 | 2016.3.26
345 | 2016.3.27
There are several ways to split the first data table using the join-key sub-table. One way is to mapjoin the join-key sub-table with the first data table a first time, obtaining the data that matches the join-key sub-table and putting it into the first data sub-table, and then to mapjoin them a second time, obtaining the data that does not match the join-key sub-table and putting it into the second data sub-table. Another way is to mapjoin the join-key sub-table with the first data table only once, and during that mapjoin to tag each record as either matching or not matching the join-key sub-table; according to the tags, the matching data is put into the first data sub-table and the non-matching data into the second data sub-table. These are only two examples; actual implementations are not limited to them, as long as the split into Table 5 and Table 6 can be achieved.
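The single-pass variant can be sketched as follows. This is a minimal in-memory illustration, assuming the join-key sub-table is small enough to hold as a Python set; the function name `split_by_keys` is hypothetical and the rows are taken from Table 1.

```python
def split_by_keys(rows, skewed_keys):
    """Split rows in one pass: rows whose key is in skewed_keys go to the
    first sub-table, all other rows go to the second sub-table."""
    skewed = set(skewed_keys)  # the small join-key sub-table, held in memory
    first_sub, second_sub = [], []
    for row in rows:
        matches = row[0] in skewed  # the tag: matching or not matching
        (first_sub if matches else second_sub).append(row)
    return first_sub, second_sub

first_table = [
    (123, "2016.3.21"),
    (123, "2016.3.24"),
    (234, "2016.3.26"),
    (345, "2016.3.27"),
]
first_sub, second_sub = split_by_keys(first_table, [123])
```

The two-pass variant would scan the first data table twice, once keeping the matching rows and once keeping the non-matching rows; the single pass yields the same two sub-tables with one scan.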
In step 102, the data that matches the first data sub-table is extracted from the second data table and put into a third data sub-table.
For example, the join-key sub-table in Table 4 can be mapjoined with the second data table in Table 2 to obtain the records of the second data table that match the join-key sub-table, and these records are put into the third data sub-table. In the above example, joining Table 4 with Table 2 yields Table 7: the key in Table 4 is user ID "123", so the record in Table 2 with the same key, user ID "123", is put into Table 7:
Table 7: Third data sub-table
User ID | User name
123 | Zhang San
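Step 102 can be sketched as a simple filter of the second data table by the keys in the join-key sub-table. The function name `extract_matching` is hypothetical; the rows come from Tables 2 and 4.

```python
def extract_matching(second_table, key_sub_table):
    """Keep only the rows of the second data table whose key appears in
    the join-key sub-table (given as (key, count) pairs)."""
    keys = {key for key, _count in key_sub_table}
    return [row for row in second_table if row[0] in keys]

second_table = [(123, "Zhang San"), (234, "Li Si"), (345, "Wang Wu")]
key_sub_table = [(123, 1000000)]
third_sub = extract_matching(second_table, key_sub_table)
```

Because the join-key sub-table holds only a few keys, the resulting third data sub-table stays small, which is what makes the mapjoin in the next step feasible.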
In step 103, the first data sub-table and the third data sub-table are mapjoined to obtain a first join table, and the second data sub-table and the second data table are joined to obtain a second join table.
In this step, the third data sub-table is a small table, so it can be mapjoined with the first data sub-table to obtain the first join table, shown in Table 8:
Table 8: First join table
User ID | Login time | User name
123 | 2016.3.21 | Zhang San
... | ... | ...
... | ... | ...
123 | 2016.3.24 | Zhang San
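The idea behind the mapjoin in this step can be sketched as a map-side (broadcast) hash join: the small table is loaded into an in-memory dictionary and the large table is streamed past it, so no shuffle by key is needed. This is an illustrative single-process sketch, not the platform's actual mapjoin implementation; `map_join` is a hypothetical name.

```python
def map_join(big_rows, small_rows):
    """Stream big_rows and look each key up in an in-memory copy of
    small_rows, yielding (key, big value, small value) for each match."""
    lookup = {}
    for key, value in small_rows:
        lookup.setdefault(key, []).append(value)
    for key, value in big_rows:
        for match in lookup.get(key, ()):
            yield (key, value, match)

first_sub = [(123, "2016.3.21"), (123, "2016.3.24")]  # skewed rows (Table 5)
third_sub = [(123, "Zhang San")]                      # small table (Table 7)
first_join = list(map_join(first_sub, third_sub))
```

In a distributed setting each map task could hold its own copy of the small table, so the skewed rows can be processed in parallel by many tasks instead of piling up on one reduce node.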
The second data sub-table is joined with the second data table to obtain the second join table, shown in Table 9:
Table 9: Second join table
User ID | Login time | User name
234 | 2016.3.26 | Li Si
345 | 2016.3.27 | Wang Wu
In step 104, the first join table and the second join table are combined to obtain a join result table, which is the result of joining the first data table with the second data table.
In this step, the first join table and the second join table obtained in step 103 can be combined; the resulting join result table is as shown in Table 3 above.
In the data table join method of this example, the data table containing skewed data is split, and the skewed data is mapjoined with a small table that matches it, which significantly improves the efficiency of joining the skewed part. The remaining non-skewed data is no longer affected by the skewed data when it is joined with the other data table, so that part also completes quickly. Both parts are processed quickly, which improves the efficiency of the data table join as a whole.
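Steps 101 to 104 can be put together in one small sketch and checked against a direct join of the two tables. This is a single-process illustration under simplifying assumptions: an in-memory hash join stands in for both the mapjoin and the ordinary join, and the names `hash_join` and `skew_aware_join` are hypothetical.

```python
from collections import Counter

def hash_join(left, right):
    """Inner join of two lists of (key, value) pairs on the key."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(k, v, m) for k, v in left for m in index.get(k, [])]

def skew_aware_join(first, second, key_limit=1):
    # Step 101: pick the most frequent keys and split the first table.
    counts = Counter(k for k, _ in first)
    skewed = {k for k, _ in counts.most_common(key_limit)}
    first_sub = [r for r in first if r[0] in skewed]
    second_sub = [r for r in first if r[0] not in skewed]
    # Step 102: extract the matching rows of the second table.
    third_sub = [r for r in second if r[0] in skewed]
    # Step 103: join each part separately (hash_join stands in for mapjoin).
    first_join = hash_join(first_sub, third_sub)
    second_join = hash_join(second_sub, second)
    # Step 104: combine the two partial results.
    return first_join + second_join

first = [(123, "2016.3.21"), (123, "2016.3.24"),
         (234, "2016.3.26"), (345, "2016.3.27")]
second = [(123, "Zhang San"), (234, "Li Si"), (345, "Wang Wu")]
result = skew_aware_join(first, second)
```

On this sample the combined result contains the same rows as joining the two tables directly, which is the property the method relies on; the sketch says nothing about the performance gain, which depends on the distributed platform.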
To implement the above method, the embodiments of the present application further provide a data table join device, applied to joining a first data table with a second data table, where the first data table includes skewed data that can cause data skew and non-skewed data other than the skewed data. As shown in Fig. 3, the device can include a table splitting unit 31, a table extraction unit 32, a table join unit 33, and a table combination unit 34, where:
the table splitting unit 31 is configured to extract the skewed data from the first data table into a first data sub-table, and to put the non-skewed data into a second data sub-table;
the table extraction unit 32 is configured to extract from the second data table the data that matches the first data sub-table, and to put it into a third data sub-table;
the table join unit 33 is configured to perform a mapjoin of the first data sub-table and the third data sub-table to obtain a first join table, and to join the second data sub-table with the second data table to obtain a second join table;
the table combination unit 34 is configured to combine the first join table and the second join table to obtain a join result table, the join result table being the result of joining the first data table with the second data table.
As shown in Fig. 4, the table splitting unit 31 in the device can include a key extraction subunit 311 and a table generation subunit 312.
The key extraction subunit 311 is configured to extract from the first data table at least one join key that causes data skew, and to put the at least one join key into a join-key sub-table;
the table generation subunit 312 is configured to put, according to the join-key sub-table, the data in the first data table that matches the join-key sub-table into the first data sub-table, and the data that does not match the join-key sub-table into the second data sub-table.
In another example, the key extraction subunit 311, when extracting join keys, is configured to: count the number of occurrences of each join key in the first data table and sort the join keys by count in descending order; and, according to a preset join-key upper limit, obtain the at least one join key whose rank falls within the upper limit as the at least one join key that causes data skew.
In another example, the table extraction unit 32, when generating the third data sub-table, is configured to: join the join-key sub-table with the second data table, and put the matching data of the second data table into the third data sub-table.
For the functions and effects of the units in the above device, refer to the implementation of the corresponding steps in the above method; they are not repeated here. Because the device embodiment essentially corresponds to the method embodiment, the relevant parts can be found in the description of the method embodiment.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present application. Those of ordinary skill in the art can understand and implement it without creative effort.
The embodiment of the data table join device of the present application can be applied on a processing device, which may for example be a computing device used for data processing in a data warehouse. The device embodiment can be implemented in software, or in hardware or a combination of software and hardware. Fig. 5 is a hardware structure diagram of a processing device on which the data table join device of the present application resides. Taking a software implementation as an example, the device in the logical sense is formed by the processor 51 of the processing device reading the corresponding computer program instructions from nonvolatile memory 53 into memory 52 and running them. In addition to the components described above and a network interface, the processing device can generally include other functional components according to its actual function, which are not detailed here.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. a kind of data table correlation method, it is characterised in that methods described is applied to enter the first tables of data and the second tables of data Row association;Wherein, first tables of data includes:Can result in data skew tilt data and the tilt data it Outer non-inclined data, methods described includes:
The first data point table is put into by extracting the tilt data in first tables of data, and the non-inclined data are put into Second data point table;
By extracting the data of matching association the first data point table in second tables of data, the 3rd data point table is put into;
First data point table and the 3rd data point table are carried out mapjoin and obtain the first contingency table, by second data Dividing table and second tables of data carries out join, obtains the second contingency table;
First contingency table and the second contingency table are combined, association results table is obtained, the association results table is described The result that first tables of data is associated with the second tables of data.
2. method according to claim 1, it is characterised in that described to be put by extracting the tilt data in the first tables of data Enter the first data point table, and the non-inclined data are put into the second data point table, including:
At least one associated key of data skew is caused by being extracted in first tables of data, at least one associated key is put In entering associated key sublist;
According to the associated key sublist, the data that the association associated key sublist is matched in first tables of data are put into described First data point table, it is impossible to which the data of the matching association associated key sublist are put into second data point table.
3. method according to claim 2, it is characterised in that described to cause data to incline by being extracted in first tables of data At least one oblique associated key, including:
Count the quantity of each associated key in first tables of data, the order by each associated key according to quantity from more to less It is ranked up;
According to associated key transformation set in advance, at least one of sequence digit within the associated key transformation is obtained Individual associated key, as at least one associated key for causing data skew.
4. method according to claim 2, it is characterised in that described by extracting matching association institute in second tables of data The data of the first data point table are stated, the 3rd data point table is put into, including:
The associated key sublist is associated with second tables of data, the data of second tables of data for obtaining will be associated It is put into the 3rd data point table.
5. method according to claim 2, it is characterised in that described according to the associated key sublist, by the described first number First data point table is put into according to the data that the association associated key sublist is matched in table, it is impossible to the matching association association The data of key sublist are put into second data point table, including:
The associated key sublist is carried out into first time mapjoin with first tables of data, the matching association associated key is obtained The data of sublist are put into first data point table;The associated key sublist and first tables of data are carried out second Mapjoin, obtains matching the data for associating the associated key sublist and is put into second data point table;
Or, the associated key sublist is carried out into a mapjoin with first tables of data, respectively to the matching association pass Join the data of key sublist and the data that associate the associated key sublist can not be matched be identified;According to the mark, will be described The data of the matching association associated key sublist are put into first data point table, it is impossible to the matching association associated key sublist Data be put into second data point table.
6. A data table association device, characterized in that the device is applied to associating a first data table with a second data table, wherein the first data table includes skewed data capable of causing data skew and non-skewed data other than the skewed data, and the device comprises:
A table splitting unit, configured to extract the skewed data from the first data table and put it into a first data split table, and to put the non-skewed data into a second data split table;
A table extraction unit, configured to extract from the second data table the data that matches the first data split table and put it into a third data split table;
A table association unit, configured to perform a mapjoin of the first data split table with the third data split table to obtain a first association table, and to perform a join of the second data split table with the second data table to obtain a second association table;
A table combination unit, configured to combine the first association table and the second association table to obtain an association result table, the association result table being the result of associating the first data table with the second data table.
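The four units can be read as one pipeline: split, extract, associate, combine. The Python sketch below illustrates that flow on in-memory lists. All names (`hash_join`, `skew_aware_join`, the `orders` and `users` tables) are hypothetical, and a single in-memory hash join stands in for both the mapjoin and the ordinary join; in a distributed engine the two differ in that a mapjoin ships the small table to every mapper while an ordinary join shuffles rows by key.

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` using an in-memory index of `right`."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def skew_aware_join(first, second, key, skewed_keys):
    hot = set(skewed_keys)
    first_split = [r for r in first if r[key] in hot]       # skewed rows of table 1
    second_split = [r for r in first if r[key] not in hot]  # non-skewed rows of table 1
    third_split = [r for r in second if r[key] in hot]      # rows of table 2 for hot keys
    first_result = hash_join(first_split, third_split, key)  # stands in for the mapjoin
    second_result = hash_join(second_split, second, key)     # stands in for the ordinary join
    return first_result + second_result                      # combine both partial results

orders = [
    {"user_id": "u1", "item": "a"},
    {"user_id": "u1", "item": "b"},
    {"user_id": "u1", "item": "c"},
    {"user_id": "u2", "item": "d"},
    {"user_id": "u9", "item": "e"},  # no matching user, dropped by the inner join
]
users = [
    {"user_id": "u1", "name": "Ann"},
    {"user_id": "u2", "name": "Bo"},
    {"user_id": "u3", "name": "Cy"},
]

result = skew_aware_join(orders, users, "user_id", skewed_keys=["u1"])
print(len(result))  # 4
```

Because the two split tables partition the first data table, the combined result contains the same rows as a direct join of the two original tables, only in a possibly different order.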
7. The device according to claim 6, characterized in that the table splitting unit comprises:
A key extraction subunit, configured to extract from the first data table at least one associated key that causes data skew, and to put the at least one associated key into an associated key sub-table;
A table generation subunit, configured to, according to the associated key sub-table, put the data in the first data table that matches the associated key sub-table into the first data split table, and put the data that cannot match the associated key sub-table into the second data split table.
8. The device according to claim 7, characterized in that the key extraction subunit, when extracting associated keys, is configured to: count the number of occurrences of each associated key in the first data table, and sort the associated keys by count in descending order; and, according to a preset associated-key range, obtain at least one associated key whose rank in the sorted order falls within that range, as the at least one associated key that causes data skew.
9. The device according to claim 7, characterized in that
The table extraction unit, when generating the third data split table, is configured to: associate the associated key sub-table with the second data table, and put the data of the second data table obtained by the association into the third data split table.
10. The device according to claim 7, characterized in that the table generation subunit is configured to:
Perform a first mapjoin of the associated key sub-table with the first data table, and put the resulting data that matches the associated key sub-table into the first data split table; perform a second mapjoin of the associated key sub-table with the first data table, and put the resulting data that cannot match the associated key sub-table into the second data split table;
Or, perform a single mapjoin of the associated key sub-table with the first data table, and mark separately the data that matches the associated key sub-table and the data that cannot match it; according to the marks, put the data that matches the associated key sub-table into the first data split table, and the data that cannot match it into the second data split table.
CN201610480216.1A 2016-06-27 2016-06-27 A kind of data table correlation method and device Pending CN106874322A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610480216.1A CN106874322A (en) 2016-06-27 2016-06-27 A kind of data table correlation method and device

Publications (1)

Publication Number Publication Date
CN106874322A (en) 2017-06-20

Family ID=59239288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610480216.1A Pending CN106874322A (en) 2016-06-27 2016-06-27 A kind of data table correlation method and device

Country Status (1)

Country Link
CN (1) CN106874322A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201014

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201014

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170620