CN106339432A

CN106339432A - System and method for balancing load according to content to be inquired

Info

Publication number: CN106339432A
Application number: CN201610688791.0A
Authority: CN
Inventors: 李晓华
Original assignee: SHANGHAI JUSHU INFORMATION TECHNOLOGY Co Ltd
Current assignee: SHANGHAI JUSHU INFORMATION TECHNOLOGY Co Ltd
Priority date: 2016-08-19
Filing date: 2016-08-19
Publication date: 2017-01-18

Abstract

The invention discloses a system for balancing load according to the content being queried. The system comprises a data partitioning server, a master server, node servers and a front-end display module, and the master server comprises a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit. The invention further discloses a method for balancing load according to the content being queried. Through parallel computation on a plurality of nodes, the computation on one large database is distributed to a plurality of node databases, the combined computing capability of multiple machines and multiple cores is fully exploited, and the query or statistics speed of a database with a large data volume is greatly increased. Balancing load according to the query content reduces unnecessary parallel computation, so that under the same hardware conditions the concurrent access capability of the whole system improves several-fold or even tens of times. The system and the method do not rely on special hardware or networks and can be implemented with ordinary PCs and a gigabit network or even a 100 Mbit/s network, so the cost-performance ratio is very high.

Description

A system and method for load balancing according to query content

Technical field

The present invention relates to a database query or statistics system, and specifically to a system for load balancing according to query content.

Background art

With the development and spread of computer technology, large databases have rapidly entered industries such as telecommunications and finance. SQL (Structured Query Language) is a command set designed specifically for operating databases; it is a database language. The main function of SQL is to communicate with various databases, allowing different types of databases to interoperate. According to ANSI (the American National Standards Institute), SQL is the standard language for relational database management systems. When using SQL, one only needs to state "what to do" without considering "how to do it". SQL statements can be used to perform various operations on a database, for example updating the data in the database or extracting data from it. At present, most popular relational database management systems, such as Oracle, Sybase, Microsoft SQL Server and Access, adopt the SQL language standard. However, as informatization deepens, all industries have built large numbers of databases, and the data volume of these databases keeps growing, which limits the speed of database queries and statistics. In a billing system, for example, the various service programs need to query the data in the database frequently, the data volume involved is very large, and the database is accessed at a very high frequency, so the excessive database interaction degrades the performance of the computer programs.

To improve the query and statistics speed of a database, the most common approach is to optimize the hardware system. For example, Chinese patent application No. 200610041548.6 proposes a method for accelerating database search speed. As shown in Fig. 1, common memory segments for storing data and data indexes are set up in system memory; daemon processes load the data and data indexes of the database into the corresponding common memory segments in an agreed manner for the business processes to call; and at the same time the daemon processes query the records in the database periodically or in a loop and promptly write the updated data content into the common memory segments.

This method of accelerating database search speed can improve the query speed of a database to a certain extent and reduce the dependence on database performance. For queries or statistics over a high-volume database, however, the method cannot fundamentally solve the problem of slow database queries because of the limits of hardware computing speed. Improvements in computing power, such as raising the CPU frequency, adding memory or increasing disk access speed, offer limited headroom, and upgrading hardware performance requires a large capital investment. How to effectively solve the speed problem of queries or statistics over large databases has therefore remained an open problem. At present, most distributed systems distribute data by random distribution or hash distribution. These are purely mathematical methods and do not match the distribution of the business. Take personal ID card numbers as an example: under hash distribution, the data of the people of one province is spread across all nodes, so a query for that province's data must access all nodes. Under heavy concurrent access, the concurrency performance of the system is therefore poor, which is inconvenient for users.
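The contrast between hash distribution and content-based distribution can be illustrated with a small sketch. It is not taken from the patent: the node names, the made-up identifiers, and the rule that the first two digits act as a region code are all assumptions chosen for illustration.

```python
import hashlib

NODES = ["n1", "n2", "n3", "n4"]  # hypothetical node names


def hash_node(id_number: str) -> str:
    """Hash distribution: the node depends only on a digest of the key."""
    digest = hashlib.md5(id_number.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


# Hypothetical content-based rule: a leading two-digit region code picks the node.
REGION_TO_NODE = {"31": "n1", "32": "n2", "33": "n3", "44": "n4"}


def content_node(id_number: str) -> str:
    """Content-based distribution: records of one region share one node."""
    return REGION_TO_NODE[id_number[:2]]


# 1,000 made-up identifiers that all share the region code "31".
ids = [f"31{serial:08d}" for serial in range(1000)]

hash_nodes = {hash_node(i) for i in ids}        # nodes a region query must visit
content_nodes = {content_node(i) for i in ids}  # nodes under the content rule

print("hash distribution touches", len(hash_nodes), "node(s)")
print("content distribution touches", len(content_nodes), "node(s)")
```

Under the hash rule a query for one region has to visit several nodes, whereas under the content rule it visits one, which is the mismatch the paragraph above describes.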

Summary of the invention

The object of the present invention is to provide a system for load balancing according to query content, so as to solve the problems raised in the background art above.

To achieve the above object, the present invention provides the following technical solution:

A system for load balancing according to query content comprises a data partitioning server, a master server, node servers and a front-end display module. The master server is connected to the data partitioning server, the front-end display module and the node servers respectively, and there are two or more node servers. The data partitioning server is connected to a source database that stores mass data, and the data partitioning server is communicatively connected to the node servers, each of which has independent computing capability. The master server comprises a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit. The node index storage unit is connected to the data partitioning server, the simplification unit and the thread allocation unit respectively; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.

As a further aspect of the present invention, each node server comprises a node processor connected to a node database, and the node servers are ordinary PCs.

As a further aspect of the present invention, the data partitioning server and the node servers are connected in a wired or wireless manner.

The method for load balancing according to query content comprises the following steps:

Step 1: set up a plurality of node databases;

Step 2: the data partitioning server splits the mass data in the source database according to a rule. When the data is split, if the correspondence information between data content and node servers is small in volume, it can be used directly as the index information; if that correspondence information is large in volume, parsing it may take too long, so the simplification unit can be used to simplify the index information and improve the parsing efficiency of the thread allocation unit. The split data is then distributed to the node databases of the node servers;

Step 3: according to the partitioning rule, index information representing the data content assigned to each node database is formed while the data is split, and the index information is stored in the node index storage unit;

Step 4: according to the partitioning rule, the index information is simplified;

Step 5: the thread allocation unit parses the query or statistics parameters and, using the index information in the node index storage unit, allocates the query or statistics tasks of the node databases, identifying the specific nodes that correspond to each particular query or statistic;

Step 6: each node processor queries or computes statistics on its node database in parallel and feeds the results back to the master server. Each node database can compute independently, so each node database takes on a share of the query or statistics task, which greatly improves database access efficiency;

Step 7: if the result set received by the master server is small or the number of node servers is small, the master server can pass the query or statistics results of the node servers directly to the front-end display module; if the number of node servers is large or the data returned to the master server by the node servers is large, the query or statistics results can be copied into the temporary table storage unit, which aggregates them into a temporary table;

Step 8: the master server queries or computes statistics on the temporary table again to form the final result set and passes the final result set to the front-end display module, which renders the received data as charts, tables and similar forms and interacts with the technician.
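The choice made in Step 7 between forwarding results directly and aggregating them through a temporary table can be sketched as a simple threshold test. The text gives no concrete limits and leaves mixed cases open, so the function name `needs_temp_table`, the thresholds `max_rows` and `max_nodes`, and the decision to use the temporary table whenever either limit is exceeded are assumptions for illustration only.

```python
def needs_temp_table(node_results, max_rows=1000, max_nodes=4):
    """Decide how the master server delivers results (illustrative sketch).

    node_results maps a node name to the list of rows that node returned.
    max_rows and max_nodes are hypothetical thresholds, not patent values.
    Returns True when the results should be aggregated in a temporary table,
    False when they can be forwarded directly to the front end.
    """
    total_rows = sum(len(rows) for rows in node_results.values())
    return total_rows > max_rows or len(node_results) > max_nodes


# Two nodes, a handful of rows: forward directly.
small = {"n1": [("store1", 15)], "n2": [("store2", 20)]}
print(needs_temp_table(small))

# Many rows from one node: aggregate through the temporary table first.
large = {"n1": [("store1", i) for i in range(5000)]}
print(needs_temp_table(large))
```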

Compared with the prior art, the beneficial effects of the present invention are as follows. Through multi-node parallel computing, the invention distributes the computation on one large database to a plurality of node databases, so that the combined computing capability of multiple machines and multiple cores is fully exploited and the query or statistics rate of a large-volume database is greatly increased. Unlike optimizing the hardware configuration, the invention is not limited by hardware headroom, and the query or statistics rate can improve 10-fold, 100-fold or even 1000-fold. The invention balances load according to the content being queried: for each query it first determines which nodes the query may access, which greatly reduces unnecessary parallel computation and, under the same hardware conditions, raises the concurrent access capability of the whole system several-fold or even tens of times. The node servers are ordinary PCs; for the same improvement in query or statistics rate, adding node servers costs less than optimizing the hardware configuration of the master server. The invention does not depend on special hardware or networks and can be implemented with ordinary PCs and a gigabit network or even a 100 Mbit/s network, so the cost-performance ratio is very high.

Brief description of the drawings

Fig. 1 is a structural schematic diagram of the system for load balancing according to query content.

Fig. 2 is a workflow diagram of the system for load balancing according to query content.

Fig. 3 is a schematic diagram of a source database with a large data volume in the system for load balancing according to query content.

Fig. 4 is a schematic diagram of parsing the source database in the system for load balancing according to query content.

In the figures: 11 - master server, 12 - node server, 13 - source database, 14 - data partitioning server, 15 - main processor, 16 - temporary table storage unit, 17 - node database, 18 - node processor, 19 - front-end display module, 20 - node index storage unit, 21 - thread allocation unit, 22 - simplification unit.

Specific embodiment

The technical solution of this patent is described in more detail below with reference to specific embodiments.

Referring to Figs. 1-2, a system for load balancing according to query content comprises a data partitioning server 14, a master server 11, node servers 12 and a front-end display module 19. The master server 11 is connected to the data partitioning server 14, the front-end display module 19 and the node servers 12 respectively, and there are two or more node servers 12. The data partitioning server 14 is connected to a source database 13 that stores mass data, and the data partitioning server 14 is communicatively connected to the node servers 12, each of which has independent computing capability. The master server 11 comprises a node index storage unit 20, a thread allocation unit 21, a simplification unit 22, a main processor 15 and a temporary table storage unit 16. The node index storage unit 20 is connected to the data partitioning server 14, the simplification unit 22 and the thread allocation unit 21 respectively; the thread allocation unit 21 is connected to the node servers 12; and the main processor 15 is connected to the temporary table storage unit 16. Each node server 12 comprises a node processor 18 connected to a node database 17, and the node servers 12 are ordinary PCs. The data partitioning server 14 and the node servers 12 are connected in a wired or wireless manner.

Specific embodiment 1

Referring to Figs. 3-4, Fig. 3 is a schematic diagram of a source database with a large data volume. This source database contains four tables: a store table, a sales table, a time table and a product table, with data volumes of 400,000, 100,000,000, 1,825 and 1,000 respectively. The data of the source database must first be split and distributed to the node databases. Because the store table and the sales table are relatively large while the time table and the product table are small, the store table and the sales table are split by the store field, whereas the time table and the product table are not split and are copied in full to every node database. When the data is split, the rows can additionally be sorted by a city field or a region field, so that the data of one city or one region lands in a single node database or on adjacent node databases as far as possible.
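The partitioning rule of this embodiment (split the large tables by the store field, copy the small tables to every node) can be sketched as follows. This is a minimal in-memory illustration; the function name `partition`, the row layout and the store-to-node assignment are assumptions, and a real system would write to node databases rather than Python lists.

```python
from collections import defaultdict


def partition(tables, split_field, split_tables, assign):
    """Return {node: {table_name: rows}} for a store-based partitioning rule.

    tables       maps a table name to a list of row dictionaries.
    split_tables names the large tables that are split by split_field.
    assign       maps each value of split_field to a node (hypothetical rule).
    Tables not listed in split_tables are replicated to every node.
    """
    nodes = sorted(set(assign.values()))
    layout = {node: defaultdict(list) for node in nodes}
    for name, rows in tables.items():
        if name in split_tables:
            for row in rows:  # route each row to the node that owns its store
                layout[assign[row[split_field]]][name].append(row)
        else:
            for node in nodes:  # small table: full copy on every node
                layout[node][name].extend(rows)
    return layout


tables = {
    "store": [{"store": "store1", "city": "A"}, {"store": "store2", "city": "B"}],
    "sales": [
        {"store": "store1", "amount": 10},
        {"store": "store2", "amount": 20},
        {"store": "store1", "amount": 5},
    ],
    "time": [{"day": "2016-08-19"}],
    "product": [{"product": "p1"}],
}
assign = {"store1": "n1", "store2": "n2"}
layout = partition(tables, "store", {"store", "sales"}, assign)
print(dict(layout["n1"]))
```

Sorting rows by city or region before assigning nodes, as the embodiment suggests, would amount to choosing `assign` so that stores of one city map to the same or neighbouring nodes.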

Index information is then formed according to the partitioning rule. Assuming the data is split by store name, the correspondence between store names and node servers is formed. For convenience of description, let the store names be store1, store2, store3, ...; the index information can then be represented by Table 1:

Table 1

Node server	Store name
n1	store1
n2	store2
n3	store3
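The index information of Table 1 behaves like a lookup from store name to node. The sketch below shows one way to hold it and resolve the nodes a query touches; the dictionary `NODE_INDEX` and the helper `nodes_for` are illustrative names, not part of the patent.

```python
# Index information from Table 1: store name -> node server.
NODE_INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}


def nodes_for(stores):
    """Return the sorted, de-duplicated nodes holding data for the given stores."""
    return sorted({NODE_INDEX[store] for store in stores if store in NODE_INDEX})


print(nodes_for(["store1", "store2"]))
```

Stores that are absent from the index are simply skipped in this sketch; a production system might instead reject the query.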

If the index information is large in volume (for example, the store names are long or there are many stores), the index information can be simplified. For example, a correspondence table with an upper-level summary field can be produced, as shown in Table 2:

Table 2

Node server	Store name	Upper-level field of the store name
n1	store1	a
n2	store2	b
n3	store3	c

The upper-level field in Table 2 represents the store names with the letters a, b and c respectively. Alternatively, the upper-level field may use an abbreviation of the store name, a specific symbol, and so on; its purpose is to simplify searching and reduce parsing time.
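One possible reading of the simplified index in Table 2 is a two-step lookup: long store names are first reduced to short codes, and the thread allocation step then works only with the codes. The sketch below assumes that reading; the dictionaries and the helpers `simplify` and `resolve` are hypothetical names.

```python
# Table 2 split into two small mappings (illustrative layout).
STORE_TO_CODE = {"store1": "a", "store2": "b", "store3": "c"}
CODE_TO_NODE = {"a": "n1", "b": "n2", "c": "n3"}


def simplify(stores):
    """Replace store names with their short upper-level codes."""
    return sorted({STORE_TO_CODE[store] for store in stores})


def resolve(codes):
    """Map short codes to the nodes that hold the matching data."""
    return sorted({CODE_TO_NODE[code] for code in codes})


codes = simplify(["store2", "store1"])
print(codes, resolve(codes))
```

With real data the codes would stand in for names that are long or numerous, so that the comparison work during parsing is done on short keys.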

When a user accesses the database, the user's query or statistics parameters are parsed and, in combination with the index information, the query or statistics tasks are allocated to the node databases. The query or statistics parameters here may be the query content, the user's permissions, and so on. For example, when a user needs to query data related to store1 and store2, the data of store1 and store2 has been split onto nodes n1 and n2 respectively, so according to the index information the system starts two threads, one on each of nodes n1 and n2, to compute in parallel, and starts no thread on node n3. When many users access the database at the same time, this greatly reduces the computation the system must perform. Furthermore, if a user is the manager of store1 and is only permitted to query data related to store1, then even if the user's query content includes both store1 and store2, the system takes the intersection of the query content and the user's permissions (this is the parsing step) and therefore starts only one thread, on node n1.
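The allocation described above (intersect the query content with the user's permissions, then start one thread per affected node) can be sketched with the standard library thread pool. The functions `plan`, `query_node` and `run` are hypothetical, and `query_node` is a placeholder that stands in for a real query against a node database.

```python
from concurrent.futures import ThreadPoolExecutor

NODE_INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}


def plan(requested, permitted):
    """Intersect query content with user permissions and group stores by node."""
    allowed = set(requested) & set(permitted)
    tasks = {}
    for store in sorted(allowed):
        tasks.setdefault(NODE_INDEX[store], []).append(store)
    return tasks


def query_node(node, stores):
    """Placeholder for the query a node processor would run on its database."""
    return node, [f"rows for {store}" for store in stores]


def run(requested, permitted):
    """Start one worker thread per node that actually holds requested data."""
    tasks = plan(requested, permitted)
    if not tasks:
        return {}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(query_node, node, stores)
                   for node, stores in tasks.items()]
        return dict(future.result() for future in futures)


# A user allowed to see every store: threads on n1 and n2, none on n3.
print(plan(["store1", "store2"], ["store1", "store2", "store3"]))
# The manager of store1 asking for store1 and store2: one thread, on n1.
print(plan(["store1", "store2"], ["store1"]))
```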

After the thread tasks have been allocated, the data of each node database is queried or aggregated; that is, a SQL instruction is executed on each node database. The result sets of the nodes are then imported into a temporary table and aggregated, and a SQL instruction is executed again on the fully imported temporary table to query or compute statistics once more, yielding the final result set. Finally, the result set is passed to the front-end display module and displayed using various presentation controls (such as tables and charts).

The preferred embodiments of this patent have been explained in detail above, but this patent is not limited to the above embodiments; within the knowledge of a person of ordinary skill in the art, various changes can also be made without departing from the spirit of this patent.
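The two rounds of SQL described above (one on each node database, one on the temporary table at the master server) can be sketched with in-memory SQLite databases standing in for the node and master databases. The table names, column names and sample rows are assumptions for illustration; the patent does not name a database engine.

```python
import sqlite3


def node_query(rows):
    """Run the per-node statistic on one (in-memory) node database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (store TEXT, amount INTEGER)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = db.execute(
        "SELECT store, SUM(amount) FROM sales GROUP BY store"
    ).fetchall()
    db.close()
    return result


# Hypothetical sales rows already distributed to two nodes.
node_data = {
    "n1": [("store1", 10), ("store1", 5)],
    "n2": [("store2", 20)],
}

# First round: every node computes its partial result set.
partials = [row for rows in node_data.values() for row in node_query(rows)]

# Second round: the master imports the partial results into a temporary
# table and runs SQL on it again to form the final result set.
master = sqlite3.connect(":memory:")
master.execute("CREATE TEMP TABLE partial_sales (store TEXT, amount INTEGER)")
master.executemany("INSERT INTO partial_sales VALUES (?, ?)", partials)
final = master.execute(
    "SELECT store, SUM(amount) FROM partial_sales GROUP BY store ORDER BY store"
).fetchall()
master.close()

print(final)
```

Summation is used here because partial sums can be combined by summing again; statistics such as averages would need the nodes to return both sums and counts.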

Claims

1. A system for load balancing according to query content, characterized in that it comprises a data partitioning server, a master server, node servers and a front-end display module, wherein the master server is connected to the data partitioning server, the front-end display module and the node servers respectively; there are two or more node servers; the data partitioning server is connected to a source database that stores mass data; the data partitioning server is communicatively connected to the node servers, each of which has independent computing capability; the master server comprises a node index storage unit, a thread allocation unit, a simplification unit, a main processor and a temporary table storage unit; the node index storage unit is connected to the data partitioning server, the simplification unit and the thread allocation unit respectively; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.

2. The system for load balancing according to query content according to claim 1, characterized in that each node server comprises a node processor, the node processor is connected to a node database, and the node servers are ordinary PCs.

3. The system for load balancing according to query content according to claim 1, characterized in that the data partitioning server and the node servers are connected in a wired or wireless manner.

4. A working method of the system for load balancing according to query content as claimed in any one of claims 1-3, characterized in that the specific steps are as follows:

Step 1: set up a plurality of node databases;

Step 4: according to the partitioning rule, the index information is simplified;

Step 6: each node processor queries or computes statistics on its node database in parallel and feeds the results back to the master server.

5. wherein each node database can compute independently, so that each node database takes on a share of the query or statistics task, which greatly improves database access efficiency;