CN104182502B - A kind of data pick-up method and device - Google Patents

Info

Publication number
CN104182502B
Authority
CN
China
Prior art keywords
data
partition
thread
data partition
thread count
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410406481.6A
Other languages
Chinese (zh)
Other versions
CN104182502A (en)
Inventor
曹连超
辛国茂
亓开元
刘伟
李占强
卢军佐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201410406481.6A
Publication of CN104182502A
Application granted
Publication of CN104182502B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Abstract

The invention provides a data extraction method applied to a relational database. The method comprises: dividing a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers; calculating a weight for each data partition from the number of rows in that partition; allocating a thread count to each data partition according to its weight, such that the thread counts allocated to all data partitions sum to a preset total thread count N, where M ≤ N; and starting N threads and extracting data from each data partition with the number of threads allocated to it. By dividing the data table into several data partitions and dynamically allocating a thread count to each partition, the invention solves the problem of data being distributed unevenly among threads and improves the efficiency of data extraction from relational databases.

Description

A data extraction method and device
Technical field
The invention relates to the field of data extraction, and in particular to a data extraction method and device for relational databases.
Background
Data integration is the logical or physical consolidation of data from different sources, in different formats, and with different characteristics, so as to provide comprehensive data sharing. It is an important component of enterprise business intelligence and data warehousing. ETL is the primary solution for enterprise data integration; its three letters stand for Extract, Transform, and Load. Data extraction is the process of extracting data from a data source. In practice, the data source is most often a relational database.
Data can be extracted from a relational database either by directly exporting backup data or by reading the data through an interface such as ODBC or JDBC. Reading through ODBC or JDBC is the more flexible approach, because it supports both full extraction and incremental extraction. However, extraction through these interfaces is inefficient unless multiple threads run in parallel, particularly in the big-data era, when tables holding hundreds of millions of rows often have to be extracted. Parallel multithreaded extraction requires the data in the source to be split in advance. If the rows are distributed unevenly among the threads, the benefit of multithreading is greatly reduced. Giving every thread an almost exactly equal share, on the other hand, requires computing the detailed distribution of the data in the table, which takes a large number of database operations before extraction begins and itself hurts extraction efficiency. This patent proposes the concept of data pre-partitioning: a simple preliminary database operation obtains the row count of each data partition, and extraction threads are then allocated dynamically to each partition according to its row count, which effectively solves the problem described above.
Summary of the invention
The technical problem to be solved by the invention is to provide a data extraction method for relational databases that improves the efficiency of data extraction.
To solve this technical problem, the invention provides a data extraction method applied to a relational database, the method comprising:
dividing a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers;
calculating the weight of each data partition from the number of rows in that data partition;
allocating a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N;
starting N threads and, according to the allocated thread counts, extracting data from each data partition with the corresponding number of threads.
Preferably,
calculating the weight of each data partition from the number of rows in that data partition comprises:
obtaining the row count $C_m$ of each data partition, $1 \le m \le M$;
setting the weight of the m-th data partition to $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$, so that the weights of all data partitions sum to 1;
and allocating a thread count to each data partition according to its weight comprises:
allocating $\mathrm{INT}(w_m N)$ threads to the m-th data partition, where INT denotes rounding down;
allocating the remaining $N_o$ unallocated threads to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} \mathrm{INT}(w_m N)$.
Preferably,
after a thread count has been allocated to each data partition according to its weight and before the N threads are started, the method further comprises:
if the thread count allocated to a data partition is greater than or equal to 2, dividing that data partition into data sub-partitions, where the number of data sub-partitions equals the thread count allocated to the data partition and each data sub-partition corresponds to one thread.
Preferably,
after a thread count has been allocated to each data partition according to its weight and before the N threads are started, the method further comprises:
merging the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
Preferably,
extracting data from each data partition with the corresponding number of threads according to the allocated thread counts comprises:
extracting data from each data sub-partition of each data partition with the threads allocated to that data sub-partition.
The invention also provides a data extraction device applied to a relational database. The device comprises a division module, an allocation module, and an extraction module, wherein:
the division module is configured to divide a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers;
the allocation module further comprises a weight calculation unit and a thread allocation unit;
the weight calculation unit is configured to calculate the weight of each data partition from the number of rows in that data partition;
the thread allocation unit is configured to allocate a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N;
the extraction module is configured to start N threads and, according to the allocated thread counts, extract data from each data partition with the corresponding number of threads.
Preferably,
the weight calculation unit calculating the weight of each data partition from the number of rows in that data partition means:
obtaining the row count $C_m$ of each data partition, $1 \le m \le M$;
setting the weight of the m-th data partition to $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$, so that the weights of all data partitions sum to 1;
and the thread allocation unit allocating a thread count to each data partition according to its weight means:
allocating $\mathrm{INT}(w_m N)$ threads to the m-th data partition, where INT denotes rounding down;
allocating the remaining $N_o$ unallocated threads to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} \mathrm{INT}(w_m N)$.
Preferably,
the device further comprises a sub-partition module,
the sub-partition module being configured, when the thread count that the thread allocation unit allocates to a data partition is greater than or equal to 2, to divide that data partition into data sub-partitions, where the number of data sub-partitions equals the thread count allocated to the data partition and each data sub-partition corresponds to one thread.
Preferably,
the device further comprises a merging module,
the merging module being configured to merge the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
Preferably,
the extraction module extracting data from each data partition with the corresponding number of threads according to the allocated thread counts means:
extracting data from each data sub-partition of each data partition with the threads allocated to that data sub-partition.
By dividing the data table into several data partitions and dynamically allocating a thread count to each data partition, the above scheme solves the problem of data being distributed unevenly among threads and improves the efficiency of data extraction from relational databases.
Brief description of the drawings
Fig. 1 is a flowchart of the data extraction method in Embodiment 1 of the invention;
Fig. 2 is a schematic diagram of the data partitions of the data extraction method in Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of the data partitions of the data extraction method in Embodiment 1 of the invention;
Fig. 4 is a structural diagram of the data extraction device in Embodiment 1 of the invention.
Detailed description
To make the purpose, technical solution, and advantages of the application clearer, embodiments of the application are described in detail below with reference to the drawings. Where no conflict arises, the embodiments of the application and the features of those embodiments may be combined with one another.
To avoid the inefficiency caused by data being distributed unevenly among threads during multithreaded extraction, the invention proposes partitioning the value range of the data to be extracted, calculating a weight for each partition, and dynamically allocating extraction threads to each partition according to its weight. The user can set the number of partitions and the number of threads according to actual conditions. Partitioning the data turns one global problem into a series of local problems, so that thread resources can be allocated in line with how the data is actually distributed. The implementation steps of the invention are described in detail below with reference to the drawings.
Embodiment 1
As shown in Fig. 1, the data extraction method of the invention, applied to a relational database, comprises:
S101: Divide a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers.
The user can preset the number M of data partitions and the total number N of threads to allocate.
Specifically, after a field id is selected, its minimum value Min(id) and maximum value Max(id) in the database are queried by executing the following SQL statement in the relational database through an ODBC or JDBC interface:
SELECT MAX(id), MIN(id) FROM [table name]
The value range [Min(id), Max(id)] of field id is divided evenly into M data partitions, as shown in Fig. 2, and the partitions are numbered 1 to M.
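Let the interval of the m-th data partition be $RG(m)$, with left and right boundaries $R_{left}(m)$ and $R_{right}(m)$, where $R_{left}(m) = \mathrm{Min}(id) + (m-1)\,\frac{\mathrm{Max}(id) - \mathrm{Min}(id)}{M}$ and $R_{right}(m) = \mathrm{Min}(id) + m\,\frac{\mathrm{Max}(id) - \mathrm{Min}(id)}{M}$. The interval of the m-th data partition is then $RG(m) = [R_{left}(m), R_{right}(m))$ for $1 \le m < M$, and $RG(M) = [R_{left}(M), \mathrm{Max}(id)]$.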
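The equal-width split of the value range described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `partition_bounds` and the tuple layout `(left, right, right_inclusive)` are assumptions introduced here, and the ids are assumed to be plain numbers.

```python
def partition_bounds(min_id, max_id, m_parts):
    """Split [min_id, max_id] into m_parts equal-width intervals.

    Returns a list of (left, right, right_inclusive) tuples. Every interval
    is half-open [left, right) except the last one, which is closed so that
    the row holding the maximum id is not lost.
    """
    if m_parts < 1:
        raise ValueError("m_parts must be at least 1")
    width = (max_id - min_id) / m_parts
    bounds = []
    for m in range(1, m_parts + 1):
        is_last = m == m_parts
        left = min_id + (m - 1) * width
        # Use the exact maximum for the last partition to avoid float drift.
        right = max_id if is_last else min_id + m * width
        bounds.append((left, right, is_last))
    return bounds
```

For example, `partition_bounds(0, 100, 4)` yields four intervals of width 25, with only the last one including its right edge.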
S102: Calculate the weight of each data partition from the number of rows in that data partition.
First, the row count $C_m$ of each data partition must be obtained, $1 \le m \le M$.
The weight of the m-th data partition is $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$; the weights of all data partitions sum to 1.
In practice, the row counts of the M data partitions can be obtained in parallel by executing SQL queries through a database interface such as ODBC or JDBC. For a partition numbered m ($1 \le m < M$), the corresponding thread executes the following SQL query in the relational database through the ODBC or JDBC interface:
SELECT COUNT(*) FROM [table name] WHERE id >= Rleft(m) AND id < Rright(m)
For the partition numbered m = M, the corresponding thread executes the following SQL query in the relational database through the ODBC or JDBC interface:
SELECT COUNT(*) FROM [table name] WHERE id >= Rleft(m) AND id <= Max(id)
Let the row count obtained for the m-th partition be $C_m$. The total row count $C$ of the data table to be extracted is then:
$C = C_1 + \dots + C_m + \dots + C_M$, $1 \le m \le M$
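The per-partition row counts can be gathered with one `COUNT(*)` query per interval. The sketch below is an illustration under several assumptions: it uses Python's standard `sqlite3` module as a stand-in for an ODBC or JDBC connection, it runs the queries one after another rather than in parallel, and `count_rows` with its `(left, right, right_inclusive)` tuples is a hypothetical helper, not something the patent defines. The table and column names are interpolated into the SQL text, so they must come from trusted configuration.

```python
import sqlite3


def count_rows(conn, table, column, bounds):
    """Return the row count of each interval in bounds.

    bounds is a list of (left, right, right_inclusive) tuples. The last
    partition normally has right_inclusive=True so that the maximum id
    is counted, mirroring the `<=` in the second query of the text.
    """
    counts = []
    for left, right, right_inclusive in bounds:
        upper_op = "<=" if right_inclusive else "<"
        sql = (
            f"SELECT COUNT(*) FROM {table} "
            f"WHERE {column} >= ? AND {column} {upper_op} ?"
        )
        counts.append(conn.execute(sql, (left, right)).fetchone()[0])
    return counts
```

Summing the returned list gives the total row count C used in the weight calculation.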
The weight $w_m$ of the m-th data partition can be set according to the following formula: $w_m = C_m / C$, with $\sum_{m=1}^{M} w_m = 1$.
In this embodiment, according to the formula above, the more rows a data partition contains, the larger its weight.
In other embodiments, the weight of each data partition may also be set according to other rules.
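The row-count weighting used in this embodiment amounts to normalizing the counts so that they sum to 1. A minimal sketch follows; the name `partition_weights` is hypothetical, and an empty table is treated as an error because the weights would be undefined.

```python
def partition_weights(counts):
    """Return w_m = C_m / C for each partition row count C_m."""
    total = sum(counts)
    if total == 0:
        raise ValueError("the table contains no rows, so weights are undefined")
    return [c / total for c in counts]
```

With counts of 50, 30, and 20 rows, the partitions receive weights 0.5, 0.3, and 0.2.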
S103: Allocate a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N.
The number of extraction threads is allocated dynamically to each partition according to the weight of each data partition.
First, $n_{int}(m) = \mathrm{INT}(w_m N)$ threads are allocated to the m-th data partition, where INT denotes rounding down.
The remaining $N_o$ unallocated threads are then allocated to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} n_{int}(m)$.
Because $w_m N$ may have a fractional part, let $n_{dec}(m) = w_m N - \mathrm{INT}(w_m N)$.
The elements of the set $\{n_{dec}(1), \dots, n_{dec}(m), \dots, n_{dec}(M)\}$ ($1 \le m \le M$) are traversed, and the partition numbers m of the $N_o$ largest elements form a new set K. For every $k_x \in K$, the thread count allocated to the data partition numbered $k_x$ is increased by 1, so that the data partition numbered $k_x$ extracts data with $n_{int}(k_x) + 1$ threads.
At this point, all N threads have been allocated.
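The allocation in S103 (round each share down, then hand the leftover threads to the partitions with the largest fractional parts) can be sketched as follows. This is an illustration under stated assumptions: `allocate_threads` is a hypothetical name, ties between equal fractional parts are broken in favor of the lower partition number (the text does not specify a tie rule), and integer arithmetic replaces floating-point weights so that rounding error cannot change the result.

```python
def allocate_threads(counts, n_threads):
    """Distribute n_threads over partitions in proportion to their row counts.

    counts[m] is the row count C_m of partition m (0-based here).
    Returns a list of thread counts that sums to n_threads.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("the table contains no rows")
    if len(counts) > n_threads:
        raise ValueError("the method requires M <= N")

    # floor(C_m * N / C) equals INT(w_m * N) without float rounding.
    allocation = [c * n_threads // total for c in counts]
    # (C_m * N) mod C is the fractional part n_dec(m) scaled by C.
    remainders = [c * n_threads % total for c in counts]

    leftover = n_threads - sum(allocation)  # N_o in the text
    by_remainder = sorted(
        range(len(counts)), key=lambda m: remainders[m], reverse=True
    )
    for m in by_remainder[:leftover]:
        allocation[m] += 1
    return allocation
```

With row counts 700, 200, and 100 and N = 4, the rounded-down shares are 2, 0, and 0; the two leftover threads go to the first two partitions, and the third partition ends up with no thread of its own, which is the situation the merging step later in this embodiment addresses.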
S104: Start N threads and, according to the allocated thread counts, extract data from each data partition with the corresponding number of threads.
Specifically, data is extracted from each data sub-partition of each data partition with the threads allocated to that data sub-partition.
In practice, let the left and right boundary values of the data sub-interval from which thread x extracts data be $r_{left}(x)$ and $r_{right}(x)$. When $1 \le x < N$, the following SQL query is executed in the relational database through the ODBC or JDBC interface:
SELECT [field 1], [field 2], ... FROM [table name] WHERE id >= rleft(x) AND id < rright(x)
When x = N, the following SQL statement is executed in the relational database through the ODBC or JDBC interface:
SELECT [field 1], [field 2], ... FROM [table name] WHERE id >= rleft(x) AND id <= rright(x)
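The extraction step, with one range query per thread, can be sketched as below. Several things here are assumptions made for illustration: Python's standard `sqlite3` module stands in for an ODBC or JDBC driver, each worker opens its own connection to a database file, `SELECT *` replaces the explicit field list, and the names `extract_range` and `extract_parallel` are hypothetical. Table and column names are interpolated into the SQL text and must therefore be trusted values.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def extract_range(db_path, table, column, left, right, right_inclusive):
    """Fetch all rows whose column value lies in one thread's sub-interval."""
    upper_op = "<=" if right_inclusive else "<"
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {column} >= ? AND {column} {upper_op} ?"
    )
    conn = sqlite3.connect(db_path)  # one connection per worker thread
    try:
        return conn.execute(sql, (left, right)).fetchall()
    finally:
        conn.close()


def extract_parallel(db_path, table, column, ranges):
    """Run one extraction thread per (left, right, right_inclusive) range.

    Returns the fetched rows as one list per range, in range order.
    """
    if not ranges:
        return []
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        futures = [
            pool.submit(extract_range, db_path, table, column, *r)
            for r in ranges
        ]
        return [f.result() for f in futures]
```

Only the last range should be passed with `right_inclusive=True`, matching the `<=` that the text reserves for thread x = N.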
Preferably,
after step S103 and before S104, the method may further comprise:
S1031: If the thread count allocated to a data partition is greater than or equal to 2, divide that data partition into data sub-partitions, where the number of data sub-partitions equals the thread count allocated to the data partition and each data sub-partition corresponds to one thread.
In practice, let the thread count allocated to the partition numbered m be $n_c(m)$, and let the left and right boundary values of the range from which each thread inside a single partition extracts data be $r_{left}(x)$ and $r_{right}(x)$, where x is the thread number ($1 \le x \le n_c(m)$).
If $n_c(m)$ is not equal to 0, the x-th thread inside the partition numbered m extracts data from the sub-interval $rg_m(x)$ bounded by $r_{left}(x)$ and $r_{right}(x)$.
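One way to realize the sub-partitioning is to split each partition's interval into equal-width sub-intervals, one per allocated thread. The equal-width rule is an assumption made for this sketch, since the text here does not spell out the sub-interval expression, and `split_partition` with its `(left, right, right_inclusive)` tuples is a hypothetical helper.

```python
def split_partition(left, right, right_inclusive, n_threads):
    """Split one partition into n_threads equal-width sub-intervals.

    Returns (sub_left, sub_right, sub_right_inclusive) tuples. Only the
    last sub-interval can be right-inclusive, and only when the partition
    itself is (that is, when it is the last partition of the table).
    """
    if n_threads < 1:
        raise ValueError("a partition needs at least one thread to be split")
    width = (right - left) / n_threads
    subs = []
    for x in range(1, n_threads + 1):
        is_last = x == n_threads
        sub_left = left + (x - 1) * width
        # Reuse the exact partition edge for the last sub-interval.
        sub_right = right if is_last else left + x * width
        subs.append((sub_left, sub_right, right_inclusive and is_last))
    return subs
```

A partition covering [0, 30) with three threads is split into [0, 10), [10, 20), and [20, 30).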
Preferably,
after step S103 and before S104, the method may further comprise:
S1032: Merge the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
This step merges the interval of each data partition whose allocated thread count is 0 into the adjacent sub-interval of a neighboring data partition whose allocated thread count is non-zero. A data partition may be allocated 0 extraction threads and still contain data, so its interval must be merged into the adjacent sub-interval of a neighboring partition that has more than 0 threads. By default, a data partition with 0 threads is merged into the adjacent sub-interval of the partition on its right. If the data partition with 0 threads lies at the end of the whole value range, it is merged into the adjacent sub-interval of the partition on its left.
In practice, the merge can be carried out as follows:
1) If the thread count allocated to the m-th partition is 0 and the thread count allocated to the adjacent partition on its right is greater than 0, that is, $n_c(m) = 0$ and $n_c(m+1) > 0$, then, as shown in Fig. 3, the interval $RG(m)$ of the data partition numbered m is by default merged with the first data sub-interval $rg_{m+1}(1)$ inside the adjacent data partition numbered m+1 on its right, that is, $rg_{m+1}(1) = rg_{m+1}(1) \cup RG(m)$.
2) If the thread count allocated to the M-th partition is 0 and the thread count allocated to the adjacent partition on its left is greater than 0 ($n_c(M) = 0$ and $n_c(M-1) > 0$), the interval $RG(M)$ of the data partition numbered M is by default merged with the $n_c(M-1)$-th data sub-interval $rg_{M-1}(n_c(M-1))$ inside the adjacent partition numbered M-1 on its left (the rightmost data sub-interval inside that partition), that is, $rg_{M-1}(n_c(M-1)) = rg_{M-1}(n_c(M-1)) \cup RG(M)$.
3) If several consecutive data partitions are allocated a thread count of 0, these data partitions are first merged with one another, and then 1) or 2) is applied.
The merged data sub-intervals serve as the boundary values within which each thread extracts data.
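The merge rules can be combined with sub-partitioning into a single pass that produces one range per thread. The sketch below is illustrative only: `thread_ranges` is a hypothetical name, the partitions are assumed to be contiguous `(left, right)` pairs in ascending order, sub-intervals are assumed to be equal-width, and boundary inclusivity is left out for brevity.

```python
def thread_ranges(partitions, threads):
    """Return one (left, right) range per thread after merging idle partitions.

    A run of zero-thread partitions is absorbed by the first sub-interval of
    the next partition that has threads (rules 1 and 3); a trailing run is
    absorbed by the last sub-interval of the previous one (rule 2).
    """
    if len(partitions) != len(threads):
        raise ValueError("partitions and threads must have the same length")
    if not any(threads):
        raise ValueError("at least one partition must have a thread")

    ranges = []
    pending_left = None  # left edge of the current run of idle partitions
    for (left, right), n in zip(partitions, threads):
        if n == 0:
            if pending_left is None:
                pending_left = left
            continue
        width = (right - left) / n
        subs = [[left + x * width, left + (x + 1) * width] for x in range(n)]
        subs[-1][1] = right  # keep the exact partition edge
        if pending_left is not None:
            subs[0][0] = pending_left  # extend leftwards over the idle run
            pending_left = None
        ranges.extend(subs)
    if pending_left is not None:
        ranges[-1][1] = partitions[-1][1]  # trailing idle run merges leftwards
    return [tuple(r) for r in ranges]
```

For partitions [0, 10), [10, 20), [20, 30), [30, 40] with thread counts 2, 0, 1, 0, the three threads receive the ranges 0 to 5, 5 to 10, and 10 to 40.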
As shown in Fig. 4, Embodiment 1 also provides a data extraction device comprising a division module 11, an allocation module 12, and an extraction module 13, wherein:
the division module 11 is configured to divide a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers;
the allocation module 12 further comprises a weight calculation unit 121 and a thread allocation unit 122;
the weight calculation unit 121 is configured to calculate the weight of each data partition from the number of rows in that data partition;
the thread allocation unit 122 is configured to allocate a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N;
the extraction module 13 is configured to start N threads and, according to the allocated thread counts, extract data from each data partition with the corresponding number of threads.
Preferably,
the weight calculation unit 121 calculating the weight of each data partition from the number of rows in that data partition means:
obtaining the row count $C_m$ of each data partition, $1 \le m \le M$;
setting the weight of the m-th data partition to $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$, so that the weights of all data partitions sum to 1;
and the thread allocation unit 122 allocating a thread count to each data partition according to its weight means:
allocating $\mathrm{INT}(w_m N)$ threads to the m-th data partition, where INT denotes rounding down;
allocating the remaining $N_o$ unallocated threads to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} \mathrm{INT}(w_m N)$.
Preferably, the device further comprises a sub-partition module 14,
the sub-partition module 14 being configured, when the thread count that the thread allocation unit allocates to a data partition is greater than or equal to 2, to divide that data partition into data sub-partitions, where the number of data sub-partitions equals the thread count allocated to the data partition and each data sub-partition corresponds to one thread.
Preferably, the device further comprises a merging module 15,
the merging module 15 being configured to merge the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
Preferably,
the extraction module 13 extracting data from each data partition with the corresponding number of threads according to the allocated thread counts means:
extracting data from each data sub-partition of each data partition with the threads allocated to that data sub-partition.
A person of ordinary skill in the art will understand that all or some of the steps of the above method can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Alternatively, all or some of the steps of the above embodiments can be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments can be implemented in the form of hardware or in the form of a software functional module. The application is not limited to any particular combination of hardware and software.
The foregoing describes only preferred embodiments of the application and is not intended to limit it. For those skilled in the art, the application may be modified and varied in many ways. Any modification, equivalent substitution, or improvement made within the spirit and principles of the application shall fall within its scope of protection.

Claims (8)

1. A data extraction method applied to a relational database, characterized in that the method comprises:
dividing a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers;
calculating the weight of each data partition from the number of rows in that data partition;
allocating a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N;
starting N threads and, according to the allocated thread counts, extracting data from each data partition with the corresponding number of threads;
wherein, after a thread count has been allocated to each data partition according to its weight and before the N threads are started, the method further comprises:
merging the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
2. The method of claim 1, characterized in that:
calculating the weight of each data partition from the number of rows in that data partition comprises:
obtaining the row count $C_m$ of each data partition, $1 \le m \le M$;
setting the weight of the m-th data partition to $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$, so that the weights of all data partitions sum to 1;
and allocating a thread count to each data partition according to its weight comprises:
allocating $\mathrm{INT}(w_m N)$ threads to the m-th data partition, where INT denotes rounding down;
allocating the remaining $N_o$ unallocated threads to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} \mathrm{INT}(w_m N)$.
3. The method of claim 2, characterized in that:
after a thread count has been allocated to each data partition according to its weight and before the N threads are started, the method further comprises:
if the thread count allocated to a data partition is greater than or equal to 2, dividing that data partition into data sub-partitions, where the number of data sub-partitions equals the thread count allocated to the data partition and each data sub-partition corresponds to one thread.
4. The method of claim 3, characterized in that:
extracting data from each data partition with the corresponding number of threads according to the allocated thread counts comprises:
extracting data from each data sub-partition of each data partition with the threads allocated to that data sub-partition.
5. A data extraction device applied to a relational database, characterized in that the device comprises a division module, an allocation module, and an extraction module, wherein:
the division module is configured to divide a selected data table into M data partitions according to the value range of a field in the table, where the field is of a numeric type or its values can be converted to numbers;
the allocation module further comprises a weight calculation unit and a thread allocation unit;
the weight calculation unit is configured to calculate the weight of each data partition from the number of rows in that data partition;
the thread allocation unit is configured to allocate a thread count to each data partition according to its weight, where the thread counts allocated to all data partitions sum to a preset total thread count N, and M ≤ N;
the extraction module is configured to start N threads and, according to the allocated thread counts, extract data from each data partition with the corresponding number of threads;
the device further comprises a merging module,
the merging module being configured to merge the i-th data partition with the j-th data partition, where the thread count allocated to the i-th data partition is 0, the thread count allocated to the j-th data partition is not 0, $1 \le i \le M$, $1 \le j \le M$, and $i \ne j$.
6. The device of claim 5, characterized in that:
the weight calculation unit calculating the weight of each data partition from the number of rows in that data partition means:
obtaining the row count $C_m$ of each data partition, $1 \le m \le M$;
setting the weight of the m-th data partition to $w_m = C_m / C$, where $C = C_1 + \dots + C_m + \dots + C_M$, so that the weights of all data partitions sum to 1;
and the thread allocation unit allocating a thread count to each data partition according to its weight means:
allocating $\mathrm{INT}(w_m N)$ threads to the m-th data partition, where INT denotes rounding down;
allocating the remaining $N_o$ unallocated threads to $N_o$ of the data partitions, where $N_o = N - \sum_{m=1}^{M} \mathrm{INT}(w_m N)$.
7. device as claimed in claim 6, it is characterised in that described device also includes child partition module,
The child partition module is used to when thread allocation unit is that the Thread Count that data partition is distributed is more than or equal to 2, then should Data partition is divided into data child partition, and the number of the data child partition of the data partition is the thread that the data partition is distributed Number, each data child partition one thread of correspondence of the data partition.
8. The device as claimed in claim 7, characterized in that:
The extraction module extracting data from each data partition using the corresponding number of threads, according to the allocated thread counts, means:
According to the thread count allocated to the data sub-partitions of each data partition, extracting data from each data sub-partition using the corresponding number of threads.
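The extraction described in this claim, with one thread per data sub-partition, might look like the following Python sketch. In-memory lists stand in for database table partitions, and the names `extract_rows` and `extract_all` are hypothetical; a real extractor would issue a range query against the source table instead of slicing a list.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_rows(rows, start, end):
    """Stand-in for reading rows [start, end) of one sub-partition."""
    return rows[start:end]


def extract_all(partitions, subranges, total_threads):
    """Extract every sub-partition on its own worker thread.

    partitions maps a partition name to its rows; subranges maps the same name
    to the (start, end) ranges of its sub-partitions; total_threads is N.
    """
    result = {name: [] for name in partitions}
    with ThreadPoolExecutor(max_workers=total_threads) as pool:
        futures = [
            (name, pool.submit(extract_rows, partitions[name], start, end))
            for name, ranges in subranges.items()
            for start, end in ranges
        ]
        # Collect in submission order so rows keep their original order.
        for name, future in futures:
            result[name].extend(future.result())
    return result
```

Because the number of submitted tasks equals the sum of the allocated thread counts, a pool of N workers runs all sub-partitions concurrently, which matches the requirement that the extraction module start N threads.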
CN201410406481.6A 2014-08-18 2014-08-18 A kind of data pick-up method and device Active CN104182502B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410406481.6A CN104182502B (en) 2014-08-18 2014-08-18 A kind of data pick-up method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410406481.6A CN104182502B (en) 2014-08-18 2014-08-18 A kind of data pick-up method and device

Publications (2)

Publication Number Publication Date
CN104182502A CN104182502A (en) 2014-12-03
CN104182502B true CN104182502B (en) 2017-10-27

Family

ID=51963541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410406481.6A Active CN104182502B (en) 2014-08-18 2014-08-18 A kind of data pick-up method and device

Country Status (1)

Country Link
CN (1) CN104182502B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915414A (en) * 2015-06-04 2015-09-16 北京京东尚科信息技术有限公司 Data extraction method and device
CN106708620A (en) * 2015-11-13 2017-05-24 苏宁云商集团股份有限公司 Data processing method and system
CN105468725B (en) * 2015-11-20 2019-03-08 北京京东尚科信息技术有限公司 Table segmenting extraction system and method in a kind of relevant database
CN107045512B (en) * 2016-02-05 2020-11-24 北京京东尚科信息技术有限公司 Data exchange method and system
US10579434B2 (en) 2016-08-24 2020-03-03 Improbable Worlds Ltd Simulation systems and methods using query-based interest
US10303821B2 (en) 2016-08-24 2019-05-28 Improbable Worlds Ltd. Load balancing systems and methods for spatially-optimized simulations
CN106777933B (en) * 2016-12-02 2019-05-10 郑州云海信息技术有限公司 A kind of collecting method, apparatus and system
CN107688907B (en) * 2017-09-05 2022-01-18 江苏电力信息技术有限公司 Material sampling inspection method based on queue layering processing mechanism
CN108062399A (en) * 2017-12-21 2018-05-22 新华三大数据技术有限公司 Data processing method and device
CN108664567B (en) * 2018-04-24 2022-03-04 中国银行股份有限公司 Data acquisition method and system based on data table partition
CN108984738A (en) * 2018-07-16 2018-12-11 中国银行股份有限公司 A kind of data shop fixtures method and device
CN110851266A (en) * 2018-08-03 2020-02-28 奇异世界有限公司 Load balancing through partitions and virtual processes
CN109325015B (en) * 2018-08-31 2021-07-20 创新先进技术有限公司 Method and device for extracting characteristic field of domain model
CN110032559A (en) * 2019-04-19 2019-07-19 成都四方伟业软件股份有限公司 A kind of data pick-up method and device
CN110597618B (en) * 2019-07-26 2022-06-07 苏宁云计算有限公司 Task splitting method and device of data exchange system
CN111241171A (en) * 2019-10-28 2020-06-05 杭州美创科技有限公司 Full-amount data extraction method for database
CN116163754B (en) * 2022-12-08 2023-11-21 南京坤拓土木工程科技有限公司 Tunneling parameter sample preprocessing method based on power distribution hierarchical sampling

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1897025A (en) * 2006-04-27 2007-01-17 南京联创科技股份有限公司 Parallel ETL technology of multi-thread working pack in mass data process
CN101329676A (en) * 2007-06-20 2008-12-24 华为技术有限公司 Data paralleling abstracting method and apparatus and database system
CN101882165A (en) * 2010-08-02 2010-11-10 山东中创软件工程股份有限公司 Multithreading data processing method based on ETL (Extract Transform Loading)
CN102033948A (en) * 2010-12-22 2011-04-27 中国农业银行股份有限公司 Method and device for updating data
CN103955491A (en) * 2014-04-15 2014-07-30 南威软件股份有限公司 Method for synchronizing timing data increment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7979473B2 (en) * 2005-10-07 2011-07-12 Hitachi, Ltd. Association rule extraction method and system

Also Published As

Publication number Publication date
CN104182502A (en) 2014-12-03

Similar Documents

Publication Publication Date Title
CN104182502B (en) A kind of data pick-up method and device
CN103577590A (en) Data query method and system
CN103186566B (en) A kind of data classification storage, apparatus and system
CN105718565B (en) The construction method and construction device of data warehouse model
US10127252B2 (en) History and scenario data tracking
CN104111936B (en) Data query method and system
CN111712809A (en) Learning ETL rules by example
CN103246733A (en) Dynamic form system based on metadata and generation method thereof
CN101673287A (en) SQL sentence generation method and system
CN105405053A (en) Artificial intelligent adjustment system
US20150019303A1 (en) Data quality integration
CN105159971B (en) A kind of cloud platform data retrieval method
CN104504008B (en) A kind of Data Migration algorithm based on nested SQL to HBase
CN110008199A (en) A kind of Data Migration dispositions method based on access temperature
CN110008246A (en) Metadata management method and device
CN107958048A (en) A kind of multi-dimensions database system and implementation method based on financial data analysis
CN102819589A (en) ETL (Extract Transform Load)-based data optimization method and equipment
CN104834746B (en) Heterogeneous characteristic time series data evolution clustering method based on graphics processing unit
CN106649718A (en) Large data acquisition and processing method for PDM system
CN108255852B (en) SQL execution method and device
US11036761B1 (en) Configurable database management
CN105678452A (en) Method and device for fee counting and drawing
CN104408128B (en) A kind of reading optimization method indexed based on B+ trees asynchronous refresh
CN202433952U (en) General network reporting system
CN106709029A (en) File hierarchical processing method and processing system based on Hadoop and MySQL

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant