CN104462435A - Lateral extension method of distributed database - Google Patents

Info

Publication number
CN104462435A
Authority
CN
China
Prior art keywords
data
database
node
middle layer
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410778493.1A
Other languages
Chinese (zh)
Inventor
陈琳
宋洋
胡光龙
李楠
李亚龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TONGFANG KNOWLEDGE NETWORK DIGITAL PUBLICATION TECHNOLOGY Co Ltd
Original Assignee
TONGFANG KNOWLEDGE NETWORK DIGITAL PUBLICATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TONGFANG KNOWLEDGE NETWORK DIGITAL PUBLICATION TECHNOLOGY Co Ltd
Priority to CN201410778493.1A
Publication of CN104462435A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Abstract

The invention discloses a lateral extension method for a distributed database. The method includes: creating a central node; creating data nodes; deploying middle-layer nodes, which select data nodes by probability according to load weight; redundantly creating data tables on the data nodes; creating remote tables on the middle-layer nodes; and performing data queries through the middle-layer nodes and transmitting the query results to the client. The method prevents the drop in processing capacity that a single database query point suffers under heavy access; by distributing queries effectively, it improves the query performance of the database.

Description

A lateral extension method for a distributed database
Technical field
The present invention relates to lateral extension (horizontal scaling) methods for databases, and in particular to a lateral extension method for a distributed database.
Background
Database technology emerged in the late 1960s and early 1970s and is an important branch of computer software technology. It mainly studies how to store and manage data. It is one of the fastest-developing and most widely used computer technologies and has long been a major focus of the information technology field.
Traditional database technology mostly provides database service in single-machine mode, with the database service resident on one computer. The single-machine model is simple, and database developers concentrated on techniques such as data access and query and index optimization. As these techniques matured, single-machine database technology was pushed to its limit, and standalone databases came to depend more and more on improvements in machine performance. The high cost of high-performance computers, however, is an obstacle to improving the performance of a single-machine database. A further major weakness that a single-machine database cannot avoid is the single point of failure: when the database encounters a problem it can only be shut down and then restarted once fault recovery is complete, and recovery depends on backups made manually by the database administrator. With the development of the Internet and the arrival of the big data era, ever more concurrent access and massive volumes of data pose a huge challenge to single-machine databases in both capacity and concurrent access. Distributed schemes have become a key to database technology in the big data era.
A distributed database system solves the mass-storage and concurrent-access problems of databases by distributing both data computation and data storage. It is deployed on a cluster of multiple computers; the computer nodes in the cluster are connected by a network and cooperate to form a large database that is complete and logically unified as a whole but physically distributed.
Summary of the invention
To solve the above technical problems, the object of the present invention is to provide a lateral extension method for a distributed database.
The object of the present invention is achieved by the following technical solution:
A lateral extension method for a distributed database, comprising:
Creating a database central node;
Creating database data nodes;
Deploying a middle-layer node, which selects a data node by probability according to load weight;
Redundantly creating data tables on the database data nodes;
Creating remote tables on the middle-layer node;
Performing data queries through the middle-layer node, and sending the query results to the client.
Compared with the prior art, one or more embodiments of the present invention may have the following advantages:
The method effectively prevents the decline in single-machine query processing capacity that occurs when the access volume is too high; by distributing queries effectively, it improves the query performance of the database.
The data of a distributed database system is deployed redundantly, that is, one data unit has multiple redundant copies on the cluster. When one copy is damaged, the other copies can continue to serve data access, and the damaged copy can be rebuilt or recovered automatically. When the same data receives query requests, the redundant copies can share the query load. This greatly improves the performance, scalability (simply by adding compute nodes), stability, and disaster tolerance of a distributed database. These advantages make distributed databases the mainstream of future database technology development.
Brief description of the drawings
Fig. 1 is an architecture diagram of the database central node, database data nodes, and middle-layer nodes;
Fig. 2 is a flowchart of the lateral extension method for a distributed database;
Fig. 3 is a schematic diagram of the load-balancing algorithm.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments and the accompanying drawings.
Fig. 1 shows the architecture of the database central node, database data nodes, and middle-layer nodes.
Fig. 2 shows the lateral extension method for a distributed database, which comprises the following steps:
Step 10: create the database central node.
Start the node's database service and configure the node as the central node. The central node mainly collects the distribution information of data tables and responds to table-distribution queries from middle-layer nodes. The central node can be configured in dual-center mode to prevent a single point of failure.
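The role of the central node can be illustrated with a minimal in-memory sketch. It is not taken from the patent: the class name `CentralNode`, the method names, and the choice to replace a data node's earlier report wholesale are all assumptions made for illustration. A dual-center deployment would run two such registries (primary and standby), which is not shown here.

```python
from collections import defaultdict


class CentralNode:
    """Hypothetical registry of which data nodes hold which data tables."""

    def __init__(self):
        # table name -> set of data-node ids holding an instance of that table
        self._distribution = defaultdict(set)

    def report_tables(self, node_id, table_names):
        """Record the tables a data node reports, replacing its earlier report."""
        for holders in self._distribution.values():
            holders.discard(node_id)
        for name in table_names:
            self._distribution[name].add(node_id)

    def query_distribution(self, table_name):
        """Answer a middle-layer node: which data nodes hold this table?"""
        return sorted(self._distribution.get(table_name, ()))
```

A middle-layer node would call `query_distribution` periodically, and a data node would call `report_tables` with each heartbeat.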
Step 20: create the database data nodes.
Start the node's database service and configure the node as a data node. Configure the cluster's central-node information by specifying the central-node connection information for this data node, which comprises: the primary central database service, the standby central database service, IP, and port number. The data node stores the actual data of the data tables, periodically sends heartbeat messages and its own data-table information to the central node, and responds to data query requests from middle-layer nodes.
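As a rough sketch of the configuration and heartbeat described in this step, the following uses hypothetical names throughout (`CentralEndpoint`, `DataNodeConfig`, `build_heartbeat`, `pick_center`). The heartbeat interval, the message layout, and the rule of falling back to the standby center when the primary is unreachable are assumptions, not details given in the patent.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class CentralEndpoint:
    ip: str
    port: int


@dataclass
class DataNodeConfig:
    node_id: str
    primary_center: CentralEndpoint   # primary central database service
    standby_center: CentralEndpoint   # standby central database service
    heartbeat_interval_s: float = 5.0  # assumed value, not from the patent


def build_heartbeat(config, table_names, now=None):
    """Build the periodic message: node identity plus its data-table list."""
    return {
        "node_id": config.node_id,
        "timestamp": time.time() if now is None else now,
        "tables": sorted(table_names),
    }


def pick_center(config, primary_alive):
    """Assumed failover rule: use the standby only when the primary is down."""
    return config.primary_center if primary_alive else config.standby_center
```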
Step 30: deploy the middle-layer node.
Start the node's database service and configure the node as a middle-layer node. Configure the cluster's central-node information by specifying the central-node connection information for this middle-layer node, which comprises: the primary central database service, the standby central database service, IP, and port number. The middle-layer node periodically queries the central node for the data-table distribution information of its remote tables. The remote tables on the middle layer respond to data queries: according to the distribution information of the data table and the collected load statistics, a data node is selected and the query is redirected to that data node.
Step 40: redundantly create data tables.
Data tables are created on the data nodes. A data table may be created redundantly, that is, instances of the table are created on multiple data nodes. Within the cluster, data tables with different structures must use different names.
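The naming rule in this step (same name implies same structure across the cluster) can be checked mechanically. The sketch below is one possible way to do so; `TableCatalog`, `SchemaConflict`, and the representation of a table structure as a sequence of (column, type) pairs are assumptions for illustration only.

```python
class SchemaConflict(Exception):
    """Raised when a table name is reused with a different structure."""


class TableCatalog:
    """Hypothetical cluster-wide view of table instances on data nodes."""

    def __init__(self):
        self._schemas = {}    # table name -> tuple of (column, type) pairs
        self._instances = {}  # table name -> set of data-node ids

    def create_table(self, node_id, name, schema):
        """Register an instance of a table on one data node."""
        schema = tuple(schema)
        known = self._schemas.get(name)
        if known is not None and known != schema:
            raise SchemaConflict(
                f"table {name!r} already exists with a different structure"
            )
        self._schemas[name] = schema
        self._instances.setdefault(name, set()).add(node_id)

    def instances(self, name):
        """Return the data nodes that hold an instance of the table."""
        return sorted(self._instances.get(name, ()))
```

Creating the same table on a second node adds a redundant instance; reusing the name with a different column list is rejected.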
Step 50: create remote tables on the middle-layer node.
A remote table stores the structure and definition information of a data table; it does not store the table's actual data.
Step 60: the client sends a data query request to the middle-layer node; the middle-layer node responds to the client query and redirects the query request.
The middle-layer node parses the SQL statement, finds the corresponding remote table instance, and calls the query interface. Based on the collected data-node distribution information for the table and the load information, the query interface selects an optimal data node and forwards the query request to that data node. The middle layer receives the query execution result from the data node and transmits the result set to the client.
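The request path in this step can be sketched as follows. Everything here is hypothetical: `RemoteTable`, `table_name_of`, and `handle_query` are illustrative names, the regular expression only handles a simple single-table SELECT and stands in for a real SQL parser, and node selection and remote execution are passed in as callables (`choose_node`, `execute_on`) rather than implemented.

```python
import re


class RemoteTable:
    """Holds a table's structure and known data nodes, never its row data."""

    def __init__(self, name, columns, data_nodes):
        self.name = name
        self.columns = list(columns)
        self.data_nodes = list(data_nodes)


def table_name_of(sql):
    """Rough extraction of the table name from a single-table SELECT."""
    match = re.search(r"\bfrom\s+([A-Za-z_][A-Za-z0-9_]*)", sql, re.IGNORECASE)
    if match is None:
        raise ValueError("no table found in query")
    return match.group(1)


def handle_query(sql, remote_tables, choose_node, execute_on):
    """Find the remote table, pick a data node, forward, return the result."""
    table = remote_tables[table_name_of(sql)]
    node = choose_node(table.data_nodes)
    return execute_on(node, sql)
```

In a fuller sketch, `choose_node` would be the load-weighted selection and `execute_on` a network call to the chosen data node.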
Cluster operation relies mainly on the following functional modules: the cluster table distribution information collection module, the cluster table distribution information publishing module, the cluster table distribution information query module, the node load information module, and the cluster table query request load-balancing module.
The cluster table distribution information collection module is the responsibility of the central node. After the central node starts, it instantiates the cluster table distribution information collection job class and submits it to the database's task pool for execution. The collection job receives the table distribution information reported by the data nodes, maintains the table distribution information of the whole cluster, and responds to table-distribution query requests from middle-layer nodes.
The cluster table distribution information publishing module is carried out by the data node. After the central node starts, the cluster table distribution information publishing job class is instantiated and submitted to the database's task pool for execution. The publishing job runs periodically and sends all data-table information of the data node to the central node.
The cluster table distribution information query module, the node load information module, and the cluster table query request load-balancing module are the responsibility of the middle-layer node. After the central node starts, the corresponding job classes are instantiated. According to its own remote-table deployment, the middle-layer node periodically queries the central node for remote-table distribution information and, based on the query results, maintains and updates in memory the table distribution information and load information of its remote tables. After receiving a client query request, it redirects the query to the appropriate data node according to the load information and the load-balancing algorithm, monitors the query execution, and updates the node load information.
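The patent does not say how load is measured or how load is turned into a weight. The sketch below is therefore only one assumed possibility: load is the number of queries currently in flight on a node, and the weight is `1 / (1 + load)` so that an idle node gets weight 1 and busier nodes get smaller weights. `LoadTracker` and its methods are hypothetical names.

```python
class LoadTracker:
    """Hypothetical per-node load bookkeeping kept by a middle-layer node."""

    def __init__(self, node_ids):
        # node id -> number of queries currently running on that node
        self._in_flight = {node_id: 0 for node_id in node_ids}

    def query_started(self, node_id):
        self._in_flight[node_id] += 1

    def query_finished(self, node_id):
        # Never drop below zero, even if a finish is reported twice.
        self._in_flight[node_id] = max(0, self._in_flight[node_id] - 1)

    def weights(self):
        """Low load gives high weight (assumed formula: 1 / (1 + load))."""
        return {n: 1.0 / (1 + c) for n, c in self._in_flight.items()}
```

The resulting weight dictionary is the kind of input a load-weighted probabilistic selection would consume.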
Load-balancing algorithm: to avoid transient query hot spots (several middle-layer nodes simultaneously redirecting query requests to the data node with the best load), this embodiment uses a load-based probabilistic scheduling algorithm. A data node is selected by probability according to its load weight (low load means high weight), so a lightly loaded node has a higher probability of being selected. As shown in Fig. 3, the weights of three nodes are w_a, w_b, and w_c. Three intervals are constructed: [0, w_a), [w_a, w_a + w_b), and [w_a + w_b, w_a + w_b + w_c]. A random number in the range [0, w_a + w_b + w_c] is then drawn, and the node whose interval contains that number is selected. A node with a high weight has a higher probability of being selected, while a node with a lower weight can still be selected, which keeps node selection from settling on a local optimum.
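The interval construction described above amounts to weighted random selection: node i is chosen with probability w_i divided by the sum of all weights. A minimal sketch follows; the function name `pick_node` and the injectable `rng` parameter are illustrative choices, and the weights are assumed to be positive numbers already derived from load.

```python
import bisect
import itertools
import random


def pick_node(weights, rng=random):
    """Pick a node with probability proportional to its weight.

    weights: dict mapping node id -> positive weight (low load, high weight).
    rng: any object with a uniform(a, b) method, e.g. the random module.
    """
    if not weights:
        raise ValueError("no candidate nodes")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    nodes = list(weights)
    # Upper bounds of the intervals: w_a, w_a + w_b, w_a + w_b + w_c, ...
    bounds = list(itertools.accumulate(weights[n] for n in nodes))
    x = rng.uniform(0, bounds[-1])
    # bisect_right maps x in [bounds[i-1], bounds[i]) to index i.
    index = bisect.bisect_right(bounds, x)
    # x equal to the total falls in the last (closed) interval.
    return nodes[min(index, len(nodes) - 1)]
```

With weights 1, 2, and 3, the three nodes would be chosen roughly one sixth, one third, and one half of the time, so the lightest-loaded node is favored without receiving every request.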
Although embodiments of the present invention are disclosed above, the content described is only an embodiment adopted to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the art to which the invention belongs may, without departing from the spirit and scope disclosed by the invention, make any modification or change in the form and details of implementation; the scope of patent protection of the invention, however, shall still be as defined by the appended claims.

Claims (7)

1. A lateral extension method for a distributed database, characterized in that the method comprises:
Creating a database central node;
Creating database data nodes;
Deploying a middle-layer node, which selects a data node by probability according to load weight;
Redundantly creating data tables on the database data nodes;
Creating remote tables on the middle-layer node;
Performing data queries through the middle-layer node, and sending the query results to the client.
2. The lateral extension method for a distributed database according to claim 1, characterized in that the central node is used to collect the distribution information of data tables and to respond to table-distribution queries from the middle-layer node.
3. The lateral extension method for a distributed database according to claim 1, characterized in that central-node information is configured on the database data nodes and on the middle-layer node respectively, and the central-node connection information is specified.
4. The lateral extension method for a distributed database according to claim 3, characterized in that the central-node connection information comprises: the primary central database service, the standby central database service, IP, and port number.
5. The lateral extension method for a distributed database according to claim 1, characterized in that the middle-layer node is used to query the central node for the data-table distribution information of remote tables.
6. The lateral extension method for a distributed database according to claim 1, characterized in that data tables are created on multiple database data nodes, wherein data tables with different structures use different names.
7. The lateral extension method for a distributed database according to claim 1, characterized in that the remote table stores data-table structure and definition information.
CN201410778493.1A 2014-12-15 2014-12-15 Lateral extension method of distributed database Pending CN104462435A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410778493.1A CN104462435A (en) 2014-12-15 2014-12-15 Lateral extension method of distributed database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410778493.1A CN104462435A (en) 2014-12-15 2014-12-15 Lateral extension method of distributed database

Publications (1)

Publication Number Publication Date
CN104462435A true CN104462435A (en) 2015-03-25

Family

ID=52908470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410778493.1A Pending CN104462435A (en) 2014-12-15 2014-12-15 Lateral extension method of distributed database

Country Status (1)

Country Link
CN (1) CN104462435A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101071434A (en) * 2007-05-14 2007-11-14 腾讯科技(深圳)有限公司 User distributing method, device and system for distributed database system
CN101841565A (en) * 2010-04-20 2010-09-22 中国科学院软件研究所 Database cluster system load balancing method and database cluster system
CN102447719A (en) * 2010-10-12 2012-05-09 上海遥薇(集团)有限公司 Dynamic load balancing information processing system for Web GIS service
CN102299959A (en) * 2011-08-22 2011-12-28 北京邮电大学 Load balance realizing method of database cluster system and device
US20140310259A1 (en) * 2013-04-15 2014-10-16 Vmware, Inc. Dynamic Load Balancing During Distributed Query Processing Using Query Operator Motion

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760510A (en) * 2016-02-24 2016-07-13 浪潮通用软件有限公司 Database horizontal extension method of software service system
CN107423188A (en) * 2016-03-07 2017-12-01 阿里巴巴集团控股有限公司 Log processing method and equipment
CN107423188B (en) * 2016-03-07 2021-05-07 阿里巴巴集团控股有限公司 Log processing method and device
CN107368500A (en) * 2016-05-13 2017-11-21 北京京东尚科信息技术有限公司 Data pick-up method and system
CN110019345A (en) * 2017-12-28 2019-07-16 北京京东尚科信息技术有限公司 Data processing method, device, system and medium
CN110109960A (en) * 2019-04-24 2019-08-09 孟晓丽 A kind of data acquisition expansion control system and its collecting method
CN111125101A (en) * 2019-12-16 2020-05-08 杭州涂鸦信息技术有限公司 Data center table structure consistency monitoring method and system
CN111125101B (en) * 2019-12-16 2023-10-13 杭州涂鸦信息技术有限公司 Data center table structure consistency monitoring method and system

Similar Documents

Publication Publication Date Title
CN104462435A (en) Lateral extension method of distributed database
US10747714B2 (en) Scalable distributed data store
CN109656911B (en) Distributed parallel processing database system and data processing method thereof
CN102693324B (en) Distributed database synchronization system, synchronization method and node management method
EP2932370B1 (en) System and method for performing a transaction in a massively parallel processing database
CN106202346B (en) A kind of data load cleaning engine, scheduling and storage system
CN102917025B (en) Method for business migration based on cloud computing platform
CN105138615A (en) Method and system for building big data distributed log
EP3764244A1 (en) System and method for massively parallel processing database
CN104484472B (en) A kind of data-base cluster and implementation method of a variety of heterogeneous data sources of mixing
CN107679192A (en) More cluster synergistic data processing method, system, storage medium and equipment
CN103581332B (en) HDFS framework and pressure decomposition method for NameNodes in HDFS framework
CN105224637A (en) A kind of based on PostgreSQL database active and standby/the comprehensive method of cluster application
CN104239417A (en) Dynamic adjustment method and dynamic adjustment device after data fragmentation in distributed database
CN105426427A (en) MPP database cluster replica realization method based on RAID 0 storage
CN110417883A (en) A kind of design method of the point to point network structure applied to block chain
CN103279386A (en) Method for achieving high availability of computer operation scheduling system
CN103399894A (en) Distributed transaction processing method on basis of shared storage pool
Samwel et al. F1 query: Declarative querying at scale
CN110175089A (en) A kind of dual-active disaster recovery and backup systems with read and write abruption function
CN103327116A (en) Dynamic copy storage method for network file
CN106156319A (en) Telescopic distributed resource description framework data storage method and device
CN104281980A (en) Remote diagnosis method and system for thermal generator set based on distributed calculation
CN102880832A (en) Method for implementing mass data management system under colony
Li et al. Apache shardingsphere: A holistic and pluggable platform for data sharding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Chen Lin

Inventor after: Song Yang

Inventor after: Hu Guanglong

Inventor after: Li Nan

Inventor after: Li Yalong

Inventor after: He Chaohui

Inventor before: Chen Lin

Inventor before: Song Yang

Inventor before: Hu Guanglong

Inventor before: Li Nan

Inventor before: Li Yalong

COR Change of bibliographic data
RJ01 Rejection of invention patent application after publication

Application publication date: 20150325