CN106503058B - Data loading method, terminal, and computing cluster
- Publication number
- CN106503058B (application number CN201610856707.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- partition
- cluster
- computing cluster
- target file
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide a data loading method, a terminal, and a computing cluster, and relate to the field of communications technologies. They can reduce the read/write latency of a KeyValue database and improve its query performance. The specific solution is as follows: a computing cluster receives a data load request that carries partition information of a data table to be loaded; determines first data partitions according to the partition information, where each partition indicated by the partition information is bound to one first data partition; obtains the source data of each partition indicated by the partition information and executes a map task on the source data of each partition; writes the intermediate data produced by each map task into the corresponding first data partition according to the binding relationship between the partitions indicated by the partition information and the first data partitions; and executes a reduce task on the intermediate data in each first data partition to obtain the target file of each reduce task, where the target files are used when the loaded data table of the KeyValue database is queried. The embodiments of the present invention are used for loading data.
Description
Technical field
The embodiments of the present invention relate to the field of communications technologies, and in particular to a data loading method, a terminal, and a computing cluster.
Background art
A distributed key-value (KeyValue) database can effectively reduce the number of disk reads and writes, offers better read/write performance, and can provide a better data query service for users. A KeyValue database often uses a MapReduce service component to load data in batches. During batch loading, MapReduce tasks are executed to generate target files whose format is consistent with the file storage format defined by the KeyValue database. The target files are stored in a distributed file system and then loaded from the distributed file system into the KeyValue database.
For a schematic structural diagram of a cluster in which both a MapReduce service component and a KeyValue database are deployed, refer to Fig. 1. In the cluster shown in Fig. 1, executing a MapReduce task requires reading a large amount of data and involves a large amount of computation such as sorting and partitioning, so the utilization of resources across the whole cluster, such as the central processing unit (CPU), network input/output (I/O) ports, and disk I/O ports, is very high. A KeyValue database has strict read/write latency requirements, generally at the millisecond level. However, when a MapReduce service component is used to load data into the KeyValue database in batches, the processes that execute the MapReduce tasks occupy a large share of resources, which correspondingly reduces the resources available to the query service processes of the KeyValue database. This affects the read/write latency of the KeyValue database and degrades its data query performance, so that users' service requirements cannot be met.
Summary of the invention
Embodiments of the present invention provide a data loading method, a terminal, and a computing cluster that can reduce the read/write latency of a KeyValue database and improve its query performance.
To achieve the foregoing objective, the embodiments of the present invention adopt the following technical solutions:
According to a first aspect, an embodiment of the present invention provides a data loading method applied to a computing cluster. The method also involves a query cluster: the computing cluster is used for data loading, the query cluster is used for data queries on a KeyValue database, and the computing cluster and the query cluster are different clusters. The method includes the following. First, the computing cluster receives a data load request that carries partition information of a data table to be loaded. Second, the computing cluster determines first data partitions according to the partition information, where each partition indicated by the partition information is bound to one first data partition. Then, the computing cluster obtains, from a distributed file system, the source data of each partition indicated by the partition information, and executes a map task on the source data of each partition. Next, according to the binding relationship between the partitions indicated by the partition information and the first data partitions, the computing cluster writes the intermediate data produced by each map task into the corresponding first data partition. Finally, the computing cluster executes a reduce task on the intermediate data in each first data partition to obtain the target file of each reduce task; the target files are used when the loaded data table of the KeyValue database is queried.
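The five steps of the first aspect can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the function name `run_load`, the `(name, start_key, end_key)` layout of the partition information, the `"key,value"` text rows, and the `first-` naming of the first data partitions are all assumptions made for illustration, and the reduce step here only sorts the pairs by key.

```python
from collections import defaultdict


def run_load(partition_info, source_rows):
    """Sketch of the five steps. partition_info is a list of
    (partition_name, start_key, end_key), end_key exclusive (assumed layout)."""
    # Step 2: bind every table partition to exactly one first data partition.
    binding = {name: f"first-{name}" for name, _, _ in partition_info}

    first_partitions = defaultdict(list)
    for name, start, end in partition_info:
        # Step 3: fetch this partition's source rows ("key,value" text) and map them.
        rows = [r for r in source_rows if start <= r.split(",", 1)[0] < end]
        intermediate = [tuple(r.split(",", 1)) for r in rows]
        # Step 4: write the intermediate pairs to the bound first data partition.
        first_partitions[binding[name]].extend(intermediate)

    # Step 5: one reduce task per first data partition; here it only sorts by key.
    return {fp: sorted(pairs) for fp, pairs in first_partitions.items()}
```

Each returned entry stands in for the target file of one reduce task. A real deployment would read from a distributed file system and write files in the storage format of the KeyValue database.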
In this way, MapReduce tasks can be executed using resources in the computing cluster to generate target files in the file storage format defined by the KeyValue database, so that the target files can be used when the loaded data table of the KeyValue database in the query cluster is queried. Because the MapReduce tasks are executed by the computing cluster, which is independent of the query cluster that provides the query service for users, the large amount of resources such as CPU and I/O ports occupied during MapReduce execution are resources of the computing cluster. Executing the MapReduce tasks does not occupy resources of the query cluster, so the load on the query cluster stays lower, which reduces the read/write latency of the KeyValue database in the query cluster and improves its query performance.
In a possible implementation of the first aspect, the method further includes: the computing cluster sends the target files to the query cluster.
According to a second aspect, an embodiment of the present invention provides a data loading method applied to a terminal. The method also involves a computing cluster and a query cluster: the computing cluster is used for data loading, the query cluster is used for data queries on a KeyValue database, and the computing cluster and the query cluster are different clusters. The method includes: the terminal sends a data load request to the computing cluster, where the data load request carries partition information of a data table to be loaded. The data load request instructs the computing cluster to determine first data partitions according to the partition information. Each partition indicated by the partition information is bound to one first data partition. A first data partition stores the intermediate data obtained by executing a map task on the source data of the partition bound to that first data partition, so that a reduce task can be executed on the intermediate data in the first data partition to obtain a target file.
In a possible implementation of the second aspect, the query cluster and the computing cluster each have their own distributed file system, and the two distributed file systems are isolated from each other. In this case, the terminal or the query cluster needs to request the computing cluster to send the target file corresponding to each first data partition to the query cluster, so that the target files are used when the data table to be loaded of the KeyValue database is queried.
In a possible implementation of the second aspect, the query cluster and the computing cluster share a distributed file system, and the query cluster obtains the target files from the shared distributed file system.
After the computing cluster generates the target files, it can store them in the distributed file system shared with the query cluster. The query cluster can then obtain the target files directly from the distributed file system and load them, so that the target files are used when the data table to be loaded of the KeyValue database is queried.
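The two delivery cases described above (isolated file systems versus a shared one) can be sketched as a single decision. The function name `deliver_target_files`, the `shared_dfs` flag, and the `send_to_query_cluster` callback are hypothetical names introduced only for this illustration; the text does not specify a transfer mechanism.

```python
def deliver_target_files(target_files, shared_dfs, send_to_query_cluster):
    """target_files maps a first data partition id to a target file path.

    Returns the partition ids whose files were explicitly sent.
    """
    if shared_dfs:
        # Shared file system: the query cluster reads the files itself,
        # so the computing cluster has nothing to transfer.
        return []
    sent = []
    for partition_id, path in sorted(target_files.items()):
        # Isolated file systems: send the file of each first data partition.
        send_to_query_cluster(partition_id, path)
        sent.append(partition_id)
    return sent
```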
In a possible implementation of the second aspect, before the terminal sends the data load request to the computing cluster, the method further includes: the terminal requests the partition information of the data table to be loaded from the query cluster.
In this way, the first data partitions can be determined according to the partition information that the terminal obtains from the query cluster.
In a possible implementation of the second aspect, the terminal stores a connection configuration set of the query cluster and a connection configuration set of the computing cluster. Before the terminal requests the partition information of the data table to be loaded from the query cluster, the method further includes: the terminal sends a first connection establishment request to the query cluster according to the connection configuration set of the query cluster. Before the terminal sends the data load request to the computing cluster, the method further includes: the terminal sends a second connection establishment request to the computing cluster according to the connection configuration set of the computing cluster.
After the terminal has established a connection with the query cluster or the computing cluster, the terminal can exchange messages with that cluster.
In a possible implementation of the second aspect, a connection configuration set includes at least one of an IP address, a port, and secure-access configuration information.
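As a rough illustration of the two connection configuration sets held by the terminal, the following sketch models each set as a small record. The class and field names (`ConnectionConfig`, `TerminalConfig`, `ip_address`, `port`, `security`) are assumptions; the text only states that a set contains at least one of the three kinds of information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConnectionConfig:
    """One connection configuration set (field names are hypothetical)."""
    ip_address: Optional[str] = None
    port: Optional[int] = None
    security: Optional[dict] = None  # secure-access configuration information

    def is_valid(self) -> bool:
        # The text requires at least one of the three items to be present.
        return any(v is not None for v in (self.ip_address, self.port, self.security))


@dataclass(frozen=True)
class TerminalConfig:
    """The terminal keeps one set per cluster it connects to."""
    query_cluster: ConnectionConfig
    computing_cluster: ConnectionConfig
```

Keeping the two sets separate mirrors the two connection establishment requests: the first goes to the query cluster and the second to the computing cluster.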
According to a third aspect, an embodiment of the present invention provides a data loading method applied to a query cluster. The method also involves a computing cluster: the query cluster is used for data queries on a KeyValue database, the computing cluster is used for data loading, and the computing cluster and the query cluster are different clusters. The method includes: the query cluster receives the target file corresponding to each first data partition sent by the computing cluster. The query cluster then loads the target file corresponding to each first data partition into the KeyValue database, so that the target files are used when the data table to be loaded of the KeyValue database is queried.
In this way, after the computing cluster generates the target files, the terminal or the query cluster can request the computing cluster to send them to the query cluster. After receiving the target files sent by the computing cluster, the query cluster can load them into the KeyValue database, so that the target files are used when the data table to be loaded of the KeyValue database is queried.
With reference to any of the foregoing aspects, in a possible implementation, the partitions indicated by the partition information have different Key value ranges. For a partition and a first data partition that are bound to each other, the source data of the partition and the intermediate data of the first data partition have the same Key value range.
In this way, when executing the MapReduce tasks, the computing cluster can obtain the source data of each partition according to its Key value range, and distribute the intermediate data of each Key value range to the first data partition bound to the corresponding partition.
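The Key value range rule can be made concrete with a short sketch. It assumes, purely for illustration, that each range is a half-open interval `[start, end)` over string keys; the helper names `find_partition` and `ranges_are_disjoint` are hypothetical.

```python
def find_partition(key_ranges, key):
    """Return the name of the partition whose [start, end) range holds key.

    key_ranges maps a partition name to a (start_key, end_key) tuple.
    Returns None when no range covers the key.
    """
    for name, (start, end) in key_ranges.items():
        if start <= key < end:
            return name
    return None


def ranges_are_disjoint(key_ranges):
    """Check that the partitions really have different, non-overlapping ranges."""
    ordered = sorted(key_ranges.values())
    return all(prev_end <= start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))
```

Because a bound partition and its first data partition share one range, the same lookup can serve both for selecting source data and for routing intermediate data.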
With reference to any of the foregoing aspects, in a possible implementation, the query cluster has second data partitions. The Key value ranges corresponding to all the second data partitions are the same as the Key value ranges corresponding to the first data partitions, and each second data partition stores the target file of its Key value range.
In this way, after the query cluster receives the target files sent by the computing cluster, or obtains them from the distributed file system shared with the computing cluster, it can store the target file corresponding to each first data partition in the corresponding second data partition, whose Key value range is the same as that of the first data partition.
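One way to picture the placement of target files into second data partitions is to match them by Key value range. The sketch below assumes both sides are keyed by a `(start_key, end_key)` tuple; the function name `place_target_files` and the choice to raise an error on a missing range are illustrative assumptions, not behavior stated in the text.

```python
def place_target_files(target_files, second_partitions):
    """Match target files to second data partitions by identical key range.

    target_files: (start_key, end_key) -> target file path
    second_partitions: (start_key, end_key) -> second data partition name
    """
    placement = {}
    for key_range, path in target_files.items():
        if key_range not in second_partitions:
            # Assumed behavior: a file without a matching range is an error.
            raise KeyError(f"no second data partition covers {key_range}")
        placement[second_partitions[key_range]] = path
    return placement
```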
With reference to any of the foregoing aspects, in a possible implementation, the partition information indicates, in the KeyValue database of the query cluster, the correspondence between the Key value range of the data table to be loaded and M target second data partitions. In addition, each target first data partition and its corresponding target second data partition correspond to the same Key value range.
According to a fourth aspect, an embodiment of the present invention provides a computing cluster, including: a receiving module, configured to receive a data load request that carries partition information of a data table to be loaded; a determining module, configured to determine first data partitions according to the partition information, where each partition indicated by the partition information is bound to one first data partition; an execution module, configured to obtain, from a distributed file system, the source data of each partition indicated by the partition information and execute a map task on the source data of each partition; and a writing module, configured to write the intermediate data produced by each map task into the corresponding first data partition according to the binding relationship between the partitions indicated by the partition information and the first data partitions. The execution module is further configured to execute a reduce task on the intermediate data in each first data partition to obtain the target file of each reduce task, where the target files are used when the loaded data table of the KeyValue database of the query cluster is queried.
In a possible implementation of the fourth aspect, the partitions indicated by the partition information have different Key value ranges. For a partition and a first data partition that are bound to each other, the source data of the partition and the intermediate data of the first data partition have the same Key value range.
In a possible implementation of the fourth aspect, the computing cluster further includes a sending module, configured to send the target files to the query cluster.
In a possible implementation of the fourth aspect, the query cluster has second data partitions. The Key value ranges corresponding to all the second data partitions are the same as the Key value ranges corresponding to the first data partitions, and each second data partition stores the target file of its Key value range.
According to a fifth aspect, an embodiment of the present invention provides a terminal, including: a sending module, configured to send a data load request to a computing cluster. The data load request carries partition information of a data table to be loaded and instructs the computing cluster to determine first data partitions according to the partition information. Each partition indicated by the partition information is bound to one first data partition. A first data partition stores the intermediate data obtained by executing a map task on the source data of the partition bound to that first data partition, so that a reduce task can be executed on the intermediate data in the first data partition to obtain a target file. The terminal further includes a request module, configured to request the computing cluster to send the target file corresponding to each first data partition to a query cluster, so that the target files are used when the data table to be loaded of the KeyValue database in the query cluster is queried.
In a possible implementation of the fifth aspect, the query cluster has second data partitions. The Key value ranges corresponding to all the second data partitions are the same as the Key value ranges corresponding to the first data partitions, and each second data partition stores the target file of its Key value range.
In a possible implementation of the fifth aspect, the request module is further configured to request the partition information of the data table to be loaded from the query cluster before the data load request is sent to the computing cluster.
According to a sixth aspect, an embodiment of the present invention provides a query cluster, including: a receiving module, configured to receive the target file corresponding to each first data partition sent by a computing cluster; and a loading module, configured to load the target file corresponding to each first data partition into a KeyValue database, so that the target files are used when the data table to be loaded of the KeyValue database is queried.
In a possible implementation of the sixth aspect, the query cluster has second data partitions. The Key value ranges corresponding to all the second data partitions are the same as the Key value ranges corresponding to the first data partitions, and each second data partition stores the target file of its Key value range.
According to another aspect, an embodiment of the present invention provides a computing cluster including multiple computing nodes. One of the computing nodes executes the data loading method provided in the first aspect or any possible implementation of the first aspect, or at least two of the computing nodes exchange data with each other to execute that method.
According to still another aspect, an embodiment of the present invention provides a computer storage medium configured to store computer software instructions used by the foregoing computing cluster, including a program designed to execute the data loading method provided in the first aspect or any possible implementation of the first aspect.
According to another aspect, an embodiment of the present invention provides a terminal including at least one processor, a memory, and a communication interface, which are connected through a bus. The memory is configured to store computer-executable instructions. The at least one processor is configured to execute the computer-executable instructions stored in the memory, so that the terminal exchanges data with the computing cluster and/or the query cluster through the communication interface to execute the data loading method provided in the foregoing embodiments.
According to still another aspect, an embodiment of the present invention provides a computer storage medium configured to store computer software instructions used by the foregoing terminal, including a program designed to execute the data loading method provided in the second aspect or any possible implementation of the second aspect.
According to another aspect, an embodiment of the present invention provides a query cluster including multiple query nodes. One of the query nodes executes the data loading method provided in the third aspect or any possible implementation of the third aspect, or at least two of the query nodes exchange data with each other to execute that method.
According to still another aspect, an embodiment of the present invention provides a computer storage medium configured to store computer software instructions used by the foregoing query cluster, including a program designed to execute the data loading method provided in the third aspect or any possible implementation of the third aspect.
According to still another aspect, an embodiment of the present invention provides a communication system including the terminal, the computing cluster, and the query cluster described in the foregoing aspects.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a schematic structural diagram of a cluster in the prior art;
Fig. 2 is a schematic diagram of a system architecture according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of another system architecture according to an embodiment of the present invention;
Fig. 4 is a flowchart of a data loading method according to an embodiment of the present invention;
Fig. 5 is a flowchart of another data loading method according to an embodiment of the present invention;
Fig. 6 is a flowchart of still another data loading method according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a computing cluster according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a query cluster according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed description of the embodiments
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The system architecture involved when a MapReduce service component is used to load data into a KeyValue database in batches may be as shown in Fig. 2. The system may include a cluster and multiple terminals. A terminal is a device that can submit requests such as batch data loading tasks and query tasks to the cluster, for example a desktop computer, a laptop, an iPad, or a smartphone. The cluster may include multiple node devices, each of which may be a computing device with computing capability. A MapReduce service component, a KeyValue database, and a distributed file system are all deployed in the cluster. The MapReduce service component has high-performance parallel computing capability and can load data into the KeyValue database in batches. The KeyValue database can respond to read and write requests from terminals and provide a query service for terminal users. The distributed file system can provide highly reliable underlying storage for the KeyValue database. For example, the cluster may be a Hadoop cluster, the distributed file system may be HDFS (Hadoop Distributed File System), and the KeyValue database may be HBase.
When a MapReduce service component is used to load data into the KeyValue database in batches, a terminal can submit a data load request to the cluster, and a management node in the MapReduce service component executes a MapReduce task after receiving the request. Specifically, a MapReduce task includes Map tasks and Reduce tasks; the stage that executes the Map tasks may include a Shuffle stage, and the stage that executes the Reduce tasks may include a Sort stage. The cluster executes a Map task to read source data and parse it into intermediate data in the form of <Key, Value> pairs. In the Shuffle stage, the <Key, Value> pairs produced by the Map task are written into data partitions (Partitions) according to their keys, so that each Reduce task can fetch its data from a Partition. Optionally, when a Reduce task is executed, the <Key, Value> pairs in the Partition may first be sorted (Sort processing). Each Reduce task corresponds to one data partition Partition, and each Partition corresponds to one data partition Region in the KeyValue database. Each Reduce task generates the target file of its Partition. Because the target files generated by the Reduce tasks are used by the query service of the KeyValue database, they conform to the file storage format defined by the KeyValue database. The cluster then loads the target files generated by the Reduce tasks from the distributed file system into the KeyValue database for use in queries.
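The Shuffle and Sort stages described above can be sketched in a few lines. This is a simplified illustration, not Hadoop or HBase code: the function names, the use of Region start keys as split points, and the tab-separated text that stands in for a target file are all assumptions. A real target file would follow the storage format defined by the KeyValue database.

```python
from bisect import bisect_right


def shuffle(pairs, region_start_keys):
    """Group <Key, Value> pairs into one Partition per Region.

    region_start_keys: sorted start keys of every Region after the first
    (an assumed way of describing the Region boundaries).
    """
    partitions = [[] for _ in range(len(region_start_keys) + 1)]
    for key, value in pairs:
        # bisect_right finds the Region whose key range contains this key.
        partitions[bisect_right(region_start_keys, key)].append((key, value))
    return partitions


def reduce_partition(pairs):
    """Sort stage plus a stand-in target file: one tab-separated line per pair."""
    return "".join(f"{k}\t{v}\n" for k, v in sorted(pairs))
```

Calling `reduce_partition` once per list returned by `shuffle` mirrors the one Reduce task per Partition relationship described in the text.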
In the system architecture shown in Fig. 2, the processes in which the MapReduce service component executes MapReduce tasks and the query service processes of the KeyValue database are located in the same cluster. While executing a batch data loading task, the MapReduce service component needs to read a large amount of data and perform a large amount of computation such as sorting and partitioning, so the load on the whole cluster is very heavy and its resource utilization is very high. This greatly affects the read/write latency of the KeyValue database in the cluster and reduces its query performance. To address this problem, the embodiments of the present invention provide a data loading method, a terminal, and a computing cluster. The MapReduce service component and the KeyValue database are deployed in different clusters, which reduces the load on the cluster where the KeyValue database is located, reduces the read/write latency of the KeyValue database, and improves its query performance. At the same time, the MapReduce service component can obtain enough resources to execute MapReduce tasks, which improves their execution efficiency.
As shown in Fig. 3, the system architecture involved in the data loading method provided in the embodiments of the present invention may include two different clusters, a query cluster and a computing cluster, and a terminal. Each cluster may include multiple node devices, each of which may be a computing device with computing capability. A KeyValue database and a distributed file system are deployed in the query cluster, which can provide a query service for users. For example, the KeyValue database may be Google Bigtable, Apache HBase, or Apache Cassandra. A MapReduce service component and a distributed file system are deployed in the computing cluster, which can store source data files and execute MapReduce tasks to load data into the KeyValue database in batches. The distributed file system in the query cluster and the distributed file system in the computing cluster may be two independent distributed file systems, or the two clusters may share the same distributed file system; this is not specifically limited here.
Based on the system architecture shown in Fig. 3, an embodiment of the present invention provides a data loading method. Referring to Fig. 4, the method may include:
101. The terminal sends a data load request to the computing cluster. The data load request carries partition information of a data table to be loaded and instructs the computing cluster to determine first data partitions according to the partition information. Each partition indicated by the partition information is bound to one first data partition. A first data partition stores the intermediate data obtained by executing a map task on the source data of the partition bound to that first data partition, so that a reduce task can be executed on the intermediate data in the first data partition to obtain a target file.
In the system architecture shown in Fig. 3, a MapReduce service component is deployed in the computing cluster. The terminal can send a data load request to the computing cluster to request that MapReduce tasks be executed using the resources of the computing cluster, so that data loading is performed after the MapReduce tasks have finished.
The data load request carries the partition information of the data table to be loaded, and the partition information indicates at least one partition. The data load request can instruct the computing cluster to determine, according to the partition information, the first data partitions bound one-to-one to the partitions indicated by the partition information. A first data partition can store the intermediate data obtained by executing a map task on the source data of the partition bound to it, so that the computing cluster can execute a reduce task on the intermediate data in the first data partition and obtain a target file. Specifically, a first data partition may be a data partition Partition in the computing cluster shown in Fig. 3.
102. After receiving the data load request sent by the terminal, the computing cluster determines the first data partitions according to the partition information.
After receiving the data load request that carries the partition information, the computing cluster can determine, according to the partition information, the first data partitions that are bound one-to-one to the partitions indicated by the partition information.
For example, when the partition information indicates 3 partitions, the computing cluster determines 3 first data partitions, which may be first data partition A (Partition A), first data partition B (Partition B) and first data partition C (Partition C) in the system architecture shown in Fig. 3.
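The one-to-one binding in step 102 can be sketched as follows. This is a minimal illustration only: the function name `bind_first_data_partitions` and the use of integer partition identifiers are assumptions, not part of the described method.

```python
def bind_first_data_partitions(partition_ids):
    """Bind each table partition to its own first data partition (A, B, C, ...).

    Hypothetical helper: the source only requires that the binding be one-to-one.
    """
    return {
        pid: f"Partition {chr(ord('A') + i)}"
        for i, pid in enumerate(partition_ids)
    }


# Three partitions indicated by the partition information, as in the example.
binding = bind_first_data_partitions([1, 2, 3])
print(binding)
```

Each table partition maps to exactly one first data partition, and no first data partition is shared between table partitions.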
103. The computing cluster obtains the source data of each partition indicated by the partition information from the distributed file system, and executes a map task on the source data of each partition.
After receiving the data load request, the computing cluster executes the MapReduce task. Specifically, the computing cluster can first obtain the source data of each partition indicated by the partition information from the distributed file system, and execute a Map task on the source data of each partition to obtain intermediate data in the form of <Key, Value> pairs.
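A Map task of this kind can be sketched as below. The source does not specify the format of the source data, so the comma-separated records and the name `map_task` are assumptions made purely for illustration.

```python
def map_task(source_lines):
    """Parse source records into (key, value) pairs.

    Assumes each record looks like "key,rest-of-record"; this format is
    hypothetical and not taken from the described method.
    """
    pairs = []
    for line in source_lines:
        line = line.strip()
        if not line:
            continue  # skip blank records
        key, _, value = line.partition(",")
        pairs.append((key, value))
    return pairs


pairs = map_task(["00000001,Alice", "", "10000005,Bob\n"])
print(pairs)
```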
104. According to the binding relationship between the partitions indicated by the partition information and the first data partitions, the computing cluster writes the intermediate data obtained by executing each map task into the corresponding first data partition.
The stage in which the computing cluster executes the Map tasks may include a Shuffle stage. After executing the Map tasks and parsing out the <Key, Value> pairs, the computing cluster can, in the Shuffle stage and according to the key, write the intermediate <Key, Value> pairs obtained from the source data of each partition into the first data partition (Partition) bound to that partition.
105. The computing cluster executes a reduce task on the intermediate data in each first data partition and obtains a target file for each reduce task. The target files are used for data queries on the data table to be loaded in the KeyValue database.
After the intermediate data obtained by executing each Map task has been written into the corresponding first data partition, the computing cluster can execute a Reduce task on the intermediate <Key, Value> pairs in each first data partition, and thereby obtain the target file corresponding to the Reduce task of each first data partition.
The format of the target file follows the file storage format defined by the KeyValue database, so that the file can be used for data queries on the data table to be loaded of the KeyValue database. For example, when the KeyValue database is HBase, the target file may be in HFile format.
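The Reduce step can be sketched as follows. The sketch emits plain tab-separated lines sorted by key as a stand-in for the database's file format; it does not produce a real HFile, and the name `reduce_task` is an assumption.

```python
def reduce_task(intermediate_pairs):
    """Turn the pairs of one first data partition into key-ordered records.

    The tab-separated text lines are a placeholder for the file storage
    format defined by the KeyValue database, not an actual HFile.
    """
    ordered = sorted(intermediate_pairs, key=lambda kv: kv[0])
    return [f"{key}\t{value}" for key, value in ordered]


records = reduce_task([("00000002", "b"), ("00000000", "x"), ("00000001", "a")])
print(records)
```

Sorting by key reflects the fact that key-value store files such as HFile keep their entries in key order.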
It can be seen that, in the data loading method provided by the embodiment of the present invention, the MapReduce task is executed using the resources of the computing cluster to generate target files in the file storage format defined by the KeyValue database, and these files are used for data queries on the data table to be loaded of the KeyValue database in the query cluster. The computing cluster that executes the MapReduce task and the query cluster that provides the query service to users are two mutually independent clusters. Therefore, even though executing the MapReduce task occupies a large amount of resources such as CPU and I/O ports, these are resources of the computing cluster, and the resources of the query cluster are not occupied. This keeps the load on the query cluster lower, reduces the read/write latency of the KeyValue database in the query cluster, and improves the query performance of the KeyValue database.
In other words, by deploying the MapReduce service component and the KeyValue database in different clusters, the data loading method provided by the embodiment of the present invention prevents the resource-intensive MapReduce task from affecting the query service, thereby reducing the load on the cluster where the KeyValue database resides and improving the query performance of the KeyValue database.
In addition, in step 101 above, the data load request may also carry a source data file storage path and an output path. In step 103 the computing cluster can obtain the source data from the source data file storage path, and after generating the target file corresponding to each first data partition in step 105, it can store these target files under the output path.
It should be noted that, after generating the target files, the computing cluster may store them on local disk or in a distributed file system. The distributed file system of the computing cluster may be independent of the distributed file system of the query cluster, or the two clusters may share the same distributed file system.
On the one hand, when the computing cluster stores the target files on local disk or in a distributed file system that is independent of the distributed file system of the query cluster, referring to Fig. 5, after step 105 the method may further include:
106. The terminal requests the computing cluster to send the target file corresponding to each first data partition to the query cluster, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database.
107. The computing cluster sends the target file corresponding to each first data partition to the query cluster.
108. The query cluster receives the target file corresponding to each first data partition sent by the computing cluster.
109. The query cluster loads the target file corresponding to each first data partition into the KeyValue database, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database.
In this case, the terminal can request the computing cluster to send the target file corresponding to each first data partition, saved on the local disk or in the distributed file system of the computing cluster, to the query cluster. After receiving these target files, the query cluster can save them on local disk or in the distributed file system used by the query cluster (a distributed file system independent of that of the computing cluster), and then load the target files from the local disk or distributed file system into the KeyValue database for use when data queries are performed on the data table to be loaded of the KeyValue database.
On the other hand, when the computing cluster stores the target files in a distributed file system shared with the query cluster, the computing cluster does not need to send the target files to the query cluster. After step 105, the method may further include:
110. The query cluster obtains the target files from the distributed file system.
After step 105, the query cluster can obtain the target files directly from the distributed file system shared with the computing cluster and perform data loading, so that the obtained target files are used when data queries are performed on the data table to be loaded of the KeyValue database. Specifically, the query cluster can obtain the target file corresponding to each first data partition from the shared distributed file system according to the partition information.
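The two hand-over paths after step 105 can be summarized in a short sketch. The enum `TargetFileStorage` and the function `steps_after_105` are hypothetical names introduced only to make the branching explicit; the step numbers are those used in the description.

```python
from enum import Enum


class TargetFileStorage(Enum):
    """Where the computing cluster keeps the target files (illustrative names)."""
    LOCAL_DISK = "local disk"
    INDEPENDENT_DFS = "distributed file system independent of the query cluster"
    SHARED_DFS = "distributed file system shared with the query cluster"


def steps_after_105(storage):
    """Return the step numbers that follow step 105 for a storage choice."""
    if storage is TargetFileStorage.SHARED_DFS:
        # The query cluster reads the files directly; no transfer is needed.
        return [110]
    # Otherwise the files are requested, sent, received and then loaded.
    return [106, 107, 108, 109]


for storage in TargetFileStorage:
    print(storage.name, steps_after_105(storage))
```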
Further, referring to Fig. 6, before step 101 above, the method may further include:
111. The terminal requests the partition information of the data table to be loaded from the query cluster.
The partition information that the terminal requests from the query cluster indicates at least one partition corresponding to the data table to be loaded. The partition information can be represented in many ways, and the embodiment of the present invention does not limit its specific form.
In the KeyValue database, the data in the data table to be loaded corresponds to a Key value range, and the data table to be loaded can be divided into at least one partition according to that Key value range. The Key value ranges of the partitions indicated by the partition information differ from one another. Here, Key is a keyword, which may specifically be a field, an attribute or a feature in the data table to be loaded.
For example, the data table to be loaded in the KeyValue database is a "user information table", which includes 4 fields: "identity", "name", "phone" and "address". The "identity" values of the users range over 00000000-29999999. The specific format of the "user information table" is shown in Table 1 below:
Table 1
Identity | Name | Phone | Address |
00000000 | … | … | … |
00000001 | … | … | … |
00000002 | … | … | … |
… | … | … | … |
29999999 | … | … | … |
In the data table to be loaded shown in Table 1, if the key is the "identity" field, the Key value range corresponding to the data table to be loaded is 00000000-29999999. The data table to be loaded can be divided into partitions at split-point Key values. For example, when the split-point Key values are 10000000 and 20000000, the data table to be loaded can be divided into 3 partitions: partition 1 corresponding to Key value range 00000000-09999999, partition 2 corresponding to Key value range 10000000-19999999, and partition 3 corresponding to Key value range 20000000-29999999.
In this example, the partition information may consist of the Key value range 00000000-29999999 of the data table to be loaded together with the split-point Key values 10000000 and 20000000. This partition information indicates that the data table to be loaded has 3 partitions, where the Key value range of partition 1 is 00000000-09999999, that of partition 2 is 10000000-19999999, and that of partition 3 is 20000000-29999999.
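The derivation of the three Key value ranges from the overall range and the split-point Key values can be worked through in a few lines. The function name `partitions_from_split_keys` and the assumption of fixed-width numeric keys are illustrative only.

```python
def partitions_from_split_keys(first_key, last_key, split_keys):
    """Derive inclusive (start, end) key ranges from split-point keys.

    Assumes fixed-width, zero-padded numeric keys, as in the example table.
    """
    width = len(first_key)
    starts = [first_key] + list(split_keys)
    # Each range ends one key before the next split point.
    ends = [f"{int(key) - 1:0{width}d}" for key in split_keys] + [last_key]
    return list(zip(starts, ends))


ranges = partitions_from_split_keys("00000000", "29999999", ["10000000", "20000000"])
for number, (start, end) in enumerate(ranges, start=1):
    print(f"partition {number}: {start}-{end}")
```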
When the Key value ranges of the partitions indicated by the partition information differ from one another, then for a partition and a first data partition that have a binding relationship, the source data of the partition and the intermediate data of the first data partition have the same Key value range.
For example, suppose the Key value range of partition 1 is 00000000-09999999, that of partition 2 is 10000000-19999999, and that of partition 3 is 20000000-29999999. If partition 1 is bound to first data partition Partition A, partition 2 to Partition B, and partition 3 to Partition C, then: the source data of partition 1 and the intermediate data of Partition A both correspond to Key value range 00000000-09999999; the source data of partition 2 and the intermediate data of Partition B both correspond to Key value range 10000000-19999999; and the source data of partition 3 and the intermediate data of Partition C both correspond to Key value range 20000000-29999999.
Thus, in step 103, the computing cluster can obtain the source data corresponding to partition 1 from the distributed file system, where the Key value range of the source data is 00000000-09999999; and in step 104, the computing cluster can write the intermediate data obtained by executing the Map task, whose Key values fall within 00000000-09999999, into Partition A.
Similarly, the computing cluster can obtain the source data corresponding to partition 2 from the distributed file system, where the Key value range of the source data is 10000000-19999999; and in step 104, the computing cluster can write the intermediate data obtained by executing the Map task, whose Key values fall within 10000000-19999999, into Partition B.
Likewise, the computing cluster can obtain the source data corresponding to partition 3 from the distributed file system, where the Key value range of the source data is 20000000-29999999; and in step 104, the computing cluster can write the intermediate data obtained by executing the Map task, whose Key values fall within 20000000-29999999, into Partition C.
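The routing of intermediate data to Partition A, B and C by Key value range can be sketched with a binary search over the split-point Key values. This is an illustrative range partitioner under the assumption of fixed-width keys; the names `first_data_partition_for` and `shuffle` are hypothetical.

```python
from bisect import bisect_right

SPLIT_KEYS = ["10000000", "20000000"]
FIRST_DATA_PARTITIONS = ["Partition A", "Partition B", "Partition C"]


def first_data_partition_for(key):
    """Pick the first data partition whose Key value range contains key.

    Fixed-width keys compare correctly as strings, so bisect works directly.
    """
    return FIRST_DATA_PARTITIONS[bisect_right(SPLIT_KEYS, key)]


def shuffle(pairs):
    """Group intermediate (key, value) pairs by first data partition."""
    buckets = {name: [] for name in FIRST_DATA_PARTITIONS}
    for key, value in pairs:
        buckets[first_data_partition_for(key)].append((key, value))
    return buckets


buckets = shuffle([("09999999", "u1"), ("10000000", "u2"), ("29999999", "u3")])
print(buckets)
```

A split-point Key value such as 10000000 belongs to the range that starts at it, which is why `bisect_right` rather than `bisect_left` is used.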
Further, the query cluster may also have second data partitions. The Key value ranges of all the second data partitions are the same as the Key value ranges of the first data partitions, and each second data partition stores the target file of its corresponding Key value range.
On this basis, after the query cluster receives, in step 108, the target file corresponding to each first data partition sent by the computing cluster, the method may further include: the query cluster saves the target file corresponding to each first data partition into the second data partition that has the same Key value range as that first data partition, and then loads the target file corresponding to each second data partition into the KeyValue database, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database.
Likewise, after the query cluster obtains, in step 110, the target file corresponding to each first data partition directly from the distributed file system shared with the computing cluster, it can save the target file corresponding to each first data partition into the second data partition that has the same Key value range as that first data partition, and then load the target file corresponding to each second data partition into the KeyValue database, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database.
For example, a second data partition in the query cluster may be a Region as shown in Fig. 3. Region1 can correspond to the same Key value range 00000000-09999999 as Partition A, and can store the target file of Key value range 00000000-09999999. Region2 can correspond to the same Key value range 10000000-19999999 as Partition B, and can store the target file of Key value range 10000000-19999999. Region3 can correspond to the same Key value range 20000000-29999999 as Partition C, and can store the target file of Key value range 20000000-29999999.
In step 108 or 110, the query cluster can store the target file of Key value range 00000000-09999999 into second data partition Region1, the target file of Key value range 10000000-19999999 into second data partition Region2, and the target file of Key value range 20000000-29999999 into second data partition Region3, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database.
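Matching target files to Regions by identical Key value range can be sketched as a simple lookup. The file names and the function `assign_to_regions` are hypothetical and serve only to illustrate the matching rule.

```python
def assign_to_regions(target_files, regions):
    """Place each target file in the Region that covers the same key range.

    target_files: {(start_key, end_key): file name}
    regions:      {(start_key, end_key): Region name}
    """
    return {regions[key_range]: name for key_range, name in target_files.items()}


# Hypothetical file names, one per first data partition.
target_files = {
    ("00000000", "09999999"): "partition_a.target",
    ("10000000", "19999999"): "partition_b.target",
    ("20000000", "29999999"): "partition_c.target",
}
regions = {
    ("00000000", "09999999"): "Region1",
    ("10000000", "19999999"): "Region2",
    ("20000000", "29999999"): "Region3",
}
placement = assign_to_regions(target_files, regions)
print(placement)
```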
In addition, when the terminal requests the partition information of the data table to be loaded from the query cluster, it can also send identification information of the data table to be loaded to the query cluster, so that the query cluster can determine the data table to be loaded and its partition information according to the identification information. The identification information indicates the data table to be loaded and may be, for example, the table name or number of the data table to be loaded; this is not specifically limited here.
In addition, the terminal may also store a connection configuration set for the query cluster and a connection configuration set for the computing cluster. A connection configuration set holds the configuration information needed when the terminal establishes a connection with the query cluster or the computing cluster, and the embodiment of the present invention does not specifically limit its content. For example, a connection configuration set may include at least one of an Internet Protocol (IP) address, a port and secure access configuration information. The IP address in a connection configuration set may be the IP address of a management node in the query cluster or computing cluster, or the set may include the IP addresses of all nodes in the query cluster or computing cluster; the port in a connection configuration set may be the port that provides the relevant service.
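A connection configuration set of this kind might be modelled as a small record. The class name `ConnectionConfigSet`, its field names, the addresses (taken from the reserved documentation range 192.0.2.0/24) and the port numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConnectionConfigSet:
    """Settings the terminal needs to connect to one cluster (illustrative)."""
    ip_addresses: List[str]  # management node only, or every node
    port: int                # port on which the relevant service listens
    security: Dict[str, str] = field(default_factory=dict)

    def endpoints(self):
        """Return host:port strings for every configured address."""
        return [f"{ip}:{self.port}" for ip in self.ip_addresses]


# Placeholder values; real deployments would supply their own.
query_cluster_config = ConnectionConfigSet(["192.0.2.10"], 16000)
computing_cluster_config = ConnectionConfigSet(["192.0.2.20", "192.0.2.21"], 8032)
print(query_cluster_config.endpoints(), computing_cluster_config.endpoints())
```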
Further, referring to Fig. 6, before step 111 above, the method may further include:
112. The terminal sends a first connection establishment request to the query cluster according to the connection configuration set of the query cluster.
Before step 101 above, the method may further include:
113. The terminal sends a second connection establishment request to the computing cluster according to the connection configuration set of the computing cluster.
Further, before step 112 above, the method may further include:
114. The terminal starts a data loading task.
The data loading task is a task of loading data into the KeyValue database in the query cluster. The terminal can start the data loading task in many ways. For example, the terminal can start a batch data loading task upon receiving a trigger instruction input by a user, start a batch data loading task automatically after the terminal boots, or start batch data loading tasks periodically; this is not specifically limited here.
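The ordering constraints among steps 114, 112, 111, 113 and 101 can be checked with a topological sort. The description fixes only which step precedes which; the relative order of steps 111 and 113 is left open, so the sketch below prints just one valid order. The dictionary name `PREDECESSORS` is an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# step -> steps that must come before it, as stated in the description
PREDECESSORS = {
    112: {114},       # start the loading task, then connect to the query cluster
    111: {112},       # connect, then request the partition information
    101: {111, 113},  # partition information and computing-cluster connection first
    113: set(),       # only required to precede step 101
}

order = list(TopologicalSorter(PREDECESSORS).static_order())
print(order)
```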
An embodiment of the present invention provides a computing cluster 700. Referring to Fig. 7, the computing cluster 700 may include a receiving module 701, a determining module 702, an execution module 703, a writing module 704 and a sending module 705. Specifically, the computing cluster 700 may include multiple computing nodes, each of which may be a computing device with computing capability; at least one computing node in the computing cluster 700 is used to deploy the functions of the modules of the computing cluster 700. The receiving module 701 can be used to receive a data load request that carries partition information of the data table to be loaded. The determining module 702 can be used to determine the first data partitions according to the partition information, where each partition indicated by the partition information is bound to one first data partition. The execution module 703 can be used to obtain the source data of each partition indicated by the partition information from the distributed file system and execute a map task on the source data of each partition. The writing module 704 can be used to write the intermediate data obtained by executing each map task into the corresponding first data partition, according to the binding relationship between the partitions indicated by the partition information and the first data partitions. The execution module 703 can also be used to execute a reduce task on the intermediate data in each first data partition and obtain a target file for each reduce task, where the target files are used for data queries on the data table to be loaded of the KeyValue database of the query cluster.
In addition, the sending module 705 can be used to perform step 107 in Fig. 5. The computing cluster 700 in Fig. 7 can be used to perform any of the procedures in the method described above, which the embodiment of the present invention does not describe in detail again here.
An embodiment of the present invention also provides a computer storage medium for storing the computer software instructions used by the computing cluster shown in Fig. 7, including a program designed to carry out the method embodiments above. Data loading can be implemented by executing the stored program.
An embodiment of the present invention provides a terminal 800. Referring to Fig. 8, the terminal 800 may include a sending module 801 and a request module 802. The sending module 801 can be used to send a data load request to the computing cluster. The data load request carries partition information of the data table to be loaded and instructs the computing cluster to determine first data partitions according to the partition information. Each partition indicated by the partition information is bound to one first data partition, and a first data partition stores the intermediate data obtained by executing map tasks on the source data of the partition bound to it, so that a reduce task can be executed on the intermediate data in the first data partition to obtain a target file. The request module 802 can be used to request the computing cluster to send the target file corresponding to each first data partition to the query cluster, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database of the query cluster.
In addition, the request module 802 can be used to perform step 111 in Fig. 6. The terminal 800 in Fig. 8 can be used to perform any of the procedures in the method described above, which the embodiment of the present invention does not describe in detail again here.
An embodiment of the present invention also provides a computer storage medium for storing the computer software instructions used by the terminal shown in Fig. 8, including a program designed to carry out the method embodiments above. Data loading can be implemented by executing the stored program.
An embodiment of the present invention provides a query cluster 900. Referring to Fig. 9, the query cluster 900 may include a receiving module 901 and a loading module 902. Specifically, the query cluster 900 may include multiple query nodes, each of which may be a computing device with computing capability; at least one query node in the query cluster 900 is used to deploy the functions of the modules of the query cluster 900. The receiving module 901 can be used to receive the target file corresponding to each first data partition sent by the computing cluster. The loading module 902 can be used to load the target file corresponding to each first data partition into the KeyValue database, so that the target files are used when data queries are performed on the data table to be loaded of the KeyValue database. In addition, the query cluster 900 in Fig. 9 can be used to perform any of the procedures in the method described above, which the embodiment of the present invention does not describe in detail again here.
An embodiment of the present invention also provides a computer storage medium for storing the computer software instructions used by the query cluster shown in Fig. 9, including a program designed to carry out the method embodiments above. Data loading can be implemented by executing the stored program.
Referring to Fig. 10, an embodiment of the present invention also provides a computing device 1000. The computing device 1000 includes at least one processor 1001, a memory 1002 and a communication interface 1003, which are connected by a bus 1004. The memory 1002 is used to store computer-executable instructions. The at least one processor 1001 is used to execute the computer-executable instructions stored in the memory 1002, so that the computing device 1000 exchanges data through the communication interface 1003 with other devices that have computing capability (such as a query node in the query cluster, the terminal, or a computing node in the computing cluster) in order to carry out the data loading method provided by the embodiments above.
In an optional embodiment, a computing node included in the computing cluster provided by the embodiment of the present invention is the computing device 1000. The computing device 1000 of the computing cluster exchanges data through the communication interface 1003 with the other computing devices 1000 in the computing cluster, the terminal and the query nodes of the query cluster, in order to carry out the data loading method provided by the embodiments above.
In an optional embodiment, a query node included in the query cluster provided by the embodiment of the present invention is the computing device 1000. The computing device 1000 of the query cluster exchanges data through the communication interface 1003 with the other computing devices 1000 in the query cluster, the terminal and the computing nodes of the computing cluster, in order to carry out the data loading method provided by the embodiments above.
In an optional embodiment, the terminal provided by the embodiment of the present invention is the computing device 1000. The computing device 1000 exchanges data through the communication interface 1003 with the computing nodes of the computing cluster and the query nodes of the query cluster, in order to carry out the data loading method provided by the embodiments above.
Optionally, the at least one processor 1001 may include processors of different types or processors of the same type. A processor 1001 may be any of the following: a central processing unit (CPU), an ARM processor, a field programmable gate array (FPGA), a dedicated processor, or another device with computing and processing capability. In an optional implementation, the at least one processor 1001 may also be integrated into a many-core processor.
Optionally, the memory 1002 may be any one or any combination of the following: random access memory (RAM), read-only memory (ROM), non-volatile memory (NVM), a solid state drive (SSD), a mechanical hard disk, a magnetic disk, a disk array, or another storage medium.
Optionally, the communication interface 1003 is used for data exchange between the computing device 1000 and other devices that have computing or storage capability. The communication interface 1003 may be any one or any combination of the following: a network interface (such as an Ethernet interface), a wireless network card, or another device with a network access function.
The bus 1004 may include an address bus, a data bus, a control bus and so on; for ease of representation, Fig. 10 shows the bus as a single thick line. The bus 1004 may be any one or any combination of the following: an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or another device for wired data transmission.
Another embodiment of the present invention provides a communication system, which may include a terminal, a computing cluster and a query cluster. For a schematic structural diagram of the communication system, refer to Fig. 3. The terminal, computing cluster and query cluster in the communication system can carry out the data loading method of the method embodiments above.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (14)
1. A data loading method, characterized in that a computing cluster is used for data loading, a query cluster is used for data queries of a KeyValue database, and the computing cluster and the query cluster are different clusters, the method comprising:
the computing cluster receives a data load request, the data load request carrying partition information of a data table to be loaded;
the computing cluster determines first data partitions according to the partition information, wherein each partition indicated by the partition information is bound to one first data partition;
the computing cluster obtains the source data of each partition indicated by the partition information from a distributed file system, and executes a map task on the source data of each partition;
the computing cluster writes the intermediate data obtained by executing each map task into the corresponding first data partition, according to the binding relationship between the partitions indicated by the partition information and the first data partitions;
the computing cluster executes a reduce task on the intermediate data in each first data partition and obtains a target file for each reduce task, the target file being used for data queries on the data table to be loaded of the KeyValue database.
2. The method according to claim 1, characterized in that the Key value ranges of the partitions indicated by the partition information differ from one another; for a partition and a first data partition that have the binding relationship, the source data of the partition and the intermediate data of the first data partition have the same Key value range.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
the computing cluster sends the target file to the query cluster.
4. The method according to claim 3, characterized in that the query cluster has second data partitions, the Key value ranges of all the second data partitions are the same as the Key value ranges of the first data partitions, and each second data partition is used to store the target file of its corresponding Key value range.
5. a kind of data load method, which is characterized in that computing cluster is loaded for data, and inquiry cluster is used for KeyValue number
According to the data query in library, the computing cluster is different clusters from the inquiry cluster, which comprises
Data load requests are sent to the computing cluster, the data load requests carry the subregion letter of tables of data to be loaded
Breath, the data load requests indicate that the computing cluster determines the first data subregion, the subregion according to the partition information
All subregions of information instruction bind the first data subregion respectively, and the first data subregion is for storing for institute
The source data stated in the subregion of the first data partition bindings executes the resulting intermediate data of mapping tasks, so as to first number
Reduction task is executed according to the intermediate data in subregion to obtain file destination;
Request the computing cluster that the corresponding file destination of each first data subregion is sent to the inquiry cluster, with
Just the file destination is used when the tables of data to be loaded of KeyValue database carries out data query.
6. The method according to claim 5, characterized in that the query cluster has second data partitions, the Key value ranges corresponding to all the second data partitions are identical to the Key value ranges corresponding to the first data partitions, and each second data partition is used to store the target file of the corresponding Key value range.
7. The method according to claim 5 or 6, characterized in that, before the data load request is sent to the computing cluster, the method further comprises:
requesting the partition information of the data table to be loaded from the query cluster.
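The terminal-side steps of claims 5 to 7 can be sketched as follows. This is only an in-memory illustration: `StubQueryCluster`, `StubComputingCluster`, `DataLoadRequest` and `terminal_load` are hypothetical names, the stub computing cluster merely invents one file name per partition, and no real network protocol or message format is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

KeyRange = Tuple[str, str]  # hypothetical half-open Key value range [start, end)


@dataclass
class DataLoadRequest:
    table: str
    partition_info: List[KeyRange]  # one entry per partition of the table


@dataclass
class StubQueryCluster:
    """In-memory stand-in for the query cluster; holds partition layouts per table."""
    layouts: Dict[str, List[KeyRange]]
    received: List[str] = field(default_factory=list)

    def get_partition_info(self, table: str) -> List[KeyRange]:
        return list(self.layouts[table])


@dataclass
class StubComputingCluster:
    """In-memory stand-in for the computing cluster."""
    target_files: Dict[KeyRange, str] = field(default_factory=dict)

    def load(self, request: DataLoadRequest) -> None:
        # Placeholder for map and reduce: one target file name per partition.
        for index, key_range in enumerate(request.partition_info):
            self.target_files[key_range] = f"{request.table}-part-{index}"

    def send_target_files(self, query_cluster: StubQueryCluster) -> None:
        query_cluster.received.extend(self.target_files.values())


def terminal_load(table: str, query_cluster: StubQueryCluster,
                  computing_cluster: StubComputingCluster) -> None:
    # Claim 7: ask the query cluster for the partition information first.
    partition_info = query_cluster.get_partition_info(table)
    # Claim 5, first step: send the data load request carrying that information.
    computing_cluster.load(DataLoadRequest(table, partition_info))
    # Claim 5, second step: request transfer of the target files to the query cluster.
    computing_cluster.send_target_files(query_cluster)


query = StubQueryCluster({"orders": [("a", "m"), ("m", "z")]})
compute = StubComputingCluster()
terminal_load("orders", query, compute)
```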
8. A computing cluster, characterized by comprising:
a receiving module, configured to receive a data load request, wherein the data load request carries partition information of a data table to be loaded;
a determining module, configured to determine first data partitions according to the partition information, wherein all partitions indicated by the partition information are respectively bound to the first data partitions;
an execution module, configured to obtain, from a distributed file system, the source data of each partition indicated by the partition information, and to execute a map task on the source data of each partition;
a writing module, configured to write the intermediate data obtained from each map task into the corresponding first data partition according to the binding relationship between the partitions indicated by the partition information and the first data partitions;
wherein the execution module is further configured to execute a reduce task on the intermediate data in each first data partition to obtain a target file for each reduce task, the target file being used when a data query is performed on the data table to be loaded in the KeyValue database of the query cluster.
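The module structure of claim 8 can be sketched as a small in-memory pipeline. The sketch assumes a one-to-one binding between partitions and first data partitions, source data given as `key,value` text lines, and a summing reduce task; these choices, and the names `map_task`, `reduce_task` and `run_load`, are illustrative assumptions rather than anything the patent specifies.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

KeyRange = Tuple[str, str]  # hypothetical half-open Key value range [start, end)
Record = Tuple[str, int]    # (key, value) pair


def map_task(lines: Iterable[str]) -> Iterator[Record]:
    """Example map task: parse 'key,value' source lines into records."""
    for line in lines:
        key, value = line.split(",")
        yield key, int(value)


def reduce_task(records: Iterable[Record]) -> List[Record]:
    """Example reduce task: sum values per key and emit records sorted by key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return sorted(totals.items())


def run_load(
    partition_info: List[KeyRange],
    read_source: Callable[[KeyRange], Iterable[str]],
) -> Dict[KeyRange, List[Record]]:
    # Determining module: bind every partition to its own first data partition.
    first_partitions: Dict[KeyRange, List[Record]] = {r: [] for r in partition_info}
    # Execution and writing modules: run the map task per partition and write
    # the intermediate data into the first data partition bound to it.
    for key_range in partition_info:
        first_partitions[key_range].extend(map_task(read_source(key_range)))
    # Execution module again: one reduce task per first data partition,
    # each producing one target file (modeled here as a sorted list).
    return {r: reduce_task(recs) for r, recs in first_partitions.items()}


source = {
    ("a", "m"): ["apple,1", "cat,2", "apple,3"],
    ("m", "z"): ["pear,5"],
}
target_files = run_load(list(source), lambda key_range: source[key_range])
```

In this sketch the map output of a partition never leaves the first data partition bound to it, so each reduce task only reads data from a single Key value range.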
9. The computing cluster according to claim 8, characterized in that the Key value ranges of the partitions indicated by the partition information are different; and for a partition and a first data partition that have the binding relationship, the source data of the partition and the intermediate data of the first data partition have the same Key value range.
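One way to read claim 9 is that the partition ranges do not overlap, so every key belongs to at most one partition. The following sketch checks that property and looks up the partition for a key; it assumes half-open string ranges, and `ranges_are_disjoint` and `find_partition` are hypothetical helper names.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

KeyRange = Tuple[str, str]  # hypothetical half-open Key value range [start, end)


def ranges_are_disjoint(ranges: List[KeyRange]) -> bool:
    """Return True if no two ranges overlap."""
    ordered = sorted(ranges)
    return all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def find_partition(ranges: List[KeyRange], key: str) -> Optional[KeyRange]:
    """Return the range containing key, or None. Assumes the ranges are disjoint."""
    ordered = sorted(ranges)
    starts = [start for start, _ in ordered]
    # Index of the last range whose start is <= key.
    index = bisect_right(starts, key) - 1
    if index >= 0 and ordered[index][0] <= key < ordered[index][1]:
        return ordered[index]
    return None


ranges = [("a", "h"), ("h", "p"), ("p", "z")]
disjoint = ranges_are_disjoint(ranges)
owner = find_partition(ranges, "hello")

# Intermediate data bound to ("h", "p") keeps the same Key value range
# when every key it contains maps back to that range.
intermediate_keys = ["hat", "ice", "oak"]
same_range = all(find_partition(ranges, k) == ("h", "p") for k in intermediate_keys)
```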
10. The computing cluster according to claim 8 or 9, characterized by further comprising:
a sending module, configured to send the target file to the query cluster.
11. The computing cluster according to claim 10, characterized in that the query cluster has second data partitions, the Key value ranges corresponding to all the second data partitions are identical to the Key value ranges corresponding to the first data partitions, and each second data partition is used to store the target file of the corresponding Key value range.
12. A terminal, characterized by comprising:
a sending module, configured to send a data load request to a computing cluster, wherein the data load request carries partition information of a data table to be loaded and instructs the computing cluster to determine first data partitions according to the partition information; all partitions indicated by the partition information are respectively bound to the first data partitions; and each first data partition is used to store the intermediate data obtained by executing a map task on the source data in the partition bound to that first data partition, so that a reduce task is executed on the intermediate data in the first data partition to obtain a target file;
a request module, configured to request the computing cluster to send the target file corresponding to each first data partition to a query cluster, so that the target file is used when a data query is performed on the data table to be loaded in the KeyValue database of the query cluster.
13. The terminal according to claim 12, characterized in that the query cluster has second data partitions, the Key value ranges corresponding to all the second data partitions are identical to the Key value ranges corresponding to the first data partitions, and each second data partition is used to store the target file of the corresponding Key value range.
14. The terminal according to claim 12 or 13, characterized in that the request module is further configured to:
before the data load request is sent to the computing cluster, request the partition information of the data table to be loaded from the query cluster.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610856707.1A CN106503058B (en) | 2016-09-27 | 2016-09-27 | A kind of data load method, terminal and computing cluster |
PCT/CN2017/087152 WO2018058998A1 (en) | 2016-09-27 | 2017-06-05 | Data loading method, terminal and computing cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610856707.1A CN106503058B (en) | 2016-09-27 | 2016-09-27 | A kind of data load method, terminal and computing cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106503058A (en) | 2017-03-15 |
CN106503058B (en) | 2019-01-18 |
Family
ID=58290036
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610856707.1A Active CN106503058B (en) | 2016-09-27 | 2016-09-27 | A kind of data load method, terminal and computing cluster |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106503058B (en) |
WO (1) | WO2018058998A1 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106503058B (en) * | 2016-09-27 | 2019-01-18 | 华为技术有限公司 | A kind of data load method, terminal and computing cluster |
CN110019125B (en) * | 2017-11-27 | 2021-12-14 | 北京京东尚科信息技术有限公司 | Database management method and device |
CN110083658B (en) * | 2019-03-11 | 2021-05-25 | 北京达佳互联信息技术有限公司 | Data synchronization method and device, electronic equipment and storage medium |
CN111090645B (en) * | 2019-10-12 | 2024-03-01 | 平安科技(深圳)有限公司 | Cloud storage-based data transmission method and device and computer equipment |
CN112988034B (en) * | 2019-12-02 | 2024-04-12 | 华为云计算技术有限公司 | Distributed system data writing method and device |
CN111651509B (en) * | 2020-04-30 | 2024-04-02 | 中国平安财产保险股份有限公司 | Hbase database-based data importing method and device, electronic equipment and medium |
CN112799820A (en) * | 2021-02-05 | 2021-05-14 | 拉卡拉支付股份有限公司 | Data processing method, data processing apparatus, electronic device, storage medium, and program product |
CN114860349B (en) * | 2022-07-06 | 2022-11-08 | 深圳华锐分布式技术股份有限公司 | Data loading method, device, equipment and medium |
CN118018488A (en) * | 2022-11-09 | 2024-05-10 | 华为技术有限公司 | Network cluster system, message transmission method and network equipment |
CN117271562B (en) * | 2023-11-21 | 2024-01-19 | 成都凌亚科技有限公司 | Data acquisition processing method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102594852A (en) * | 2011-01-04 | 2012-07-18 | 中国移动通信集团公司 | Data access method, node and system |
CN102833295A (en) * | 2011-06-17 | 2012-12-19 | 南京中兴新软件有限责任公司 | Data manipulation method and device in distributed cache system |
CN105138679A (en) * | 2015-09-14 | 2015-12-09 | 桂林电子科技大学 | Data processing system and method based on distributed caching |
EP2977899A2 (en) * | 2014-06-27 | 2016-01-27 | General Electric Company | Integrating execution of computing analytics within a mapreduce processing environment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106503058B (en) * | 2016-09-27 | 2019-01-18 | 华为技术有限公司 | A kind of data load method, terminal and computing cluster |
- 2016-09-27: CN application CN201610856707.1A, patent CN106503058B (en), status Active
- 2017-06-05: WO application PCT/CN2017/087152, patent WO2018058998A1 (en), status Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2018058998A1 (en) | 2018-04-05 |
CN106503058A (en) | 2017-03-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106503058B (en) | A kind of data load method, terminal and computing cluster | |
CN105765578B (en) | Parallel access of data in a distributed file system | |
US9749445B2 (en) | System and method for updating service information for across-domain messaging in a transactional middleware machine environment | |
US20160132541A1 (en) | Efficient implementations for mapreduce systems | |
JP2018518744A (en) | Automatic scaling of resource instance groups within a compute cluster | |
CN106331153B (en) | A kind of filter method of service request, apparatus and system | |
CN103312624A (en) | Message queue service system and method | |
AU2021269201B2 (en) | Utilizing coherently attached interfaces in a network stack framework | |
CN107391033B (en) | Data migration method and device, computing equipment and computer storage medium | |
TW202008763A (en) | Data processing method and apparatus, and client | |
CN114281263B (en) | Storage resource processing method, system and equipment of container cluster management system | |
CN102929958A (en) | Metadata processing method, agenting and forwarding equipment, server and computing system | |
CN114036031B (en) | Scheduling system and method for resource service application in enterprise digital middleboxes | |
CN108337116A (en) | Message order-preserving method and device | |
CN112860412B (en) | Service data processing method and device, electronic equipment and storage medium | |
US11962476B1 (en) | Systems and methods for disaggregated software defined networking control | |
CN108829340B (en) | Storage processing method, device, storage medium and processor | |
CN105765542B (en) | Access method, distributed memory system and the memory node of file | |
US11263184B1 (en) | Partition splitting in a distributed database | |
CN111262904A (en) | Service agent system and method | |
US20100281132A1 (en) | Multistage online transaction system, server, multistage online transaction processing method and program | |
CN114172895A (en) | Routing method, routing device, computer equipment and storage medium | |
CN114244905B (en) | Data forwarding method, device, computer equipment and storage medium | |
CN116775510B (en) | Data access method, device, server and computer readable storage medium | |
US10515027B2 (en) | Storage device sharing through queue transfer |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2022-02-17. Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province. Patentee after: Huawei Cloud Computing Technology Co.,Ltd. Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen. Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd. |