CN106339432A - System and method for balancing load according to content to be inquired - Google Patents



Publication number
CN106339432A
Authority
CN
China
Prior art keywords
node
server
data
database
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610688791.0A
Other languages
Chinese (zh)
Inventor
李晓华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI JUSHU INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI JUSHU INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI JUSHU INFORMATION TECHNOLOGY Co Ltd
Priority to CN201610688791.0A
Publication of CN106339432A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system for balancing load according to the content being queried. The system comprises a data partitioning server, a master server, node servers, and a front-end display module; the master server comprises a node index storage unit, a thread allocation unit, a simplification unit, a main processor, and a temporary table storage unit. The invention further discloses a method for balancing load according to the content being queried. Through parallel computation on multiple nodes, the computation of one large database is distributed across multiple node databases, the computing capability of multiple machines and multiple cores is fully exploited, and the query or statistics speed of a database with a large data volume is greatly increased. Load is balanced according to the queried content to reduce unnecessary concurrent computation, so the concurrent access capability of the whole system improves several-fold or even dozens of times on the same hardware. The system and method do not rely on special hardware or networks and can be implemented with ordinary PCs and a gigabit network or even a 100 Mbit network, so the cost-performance ratio is very high.

Description

A system and method for load balancing according to query content
Technical field
The present invention relates to a database query or statistics system, and specifically to a system that performs load balancing according to query content.
Background
With the development and spread of computer technology, large databases have rapidly entered industries such as telecommunications and finance. SQL (Structured Query Language) is a set of operation commands built specifically for databases; it is a database language. The main function of SQL is to communicate with various databases, so that different types of databases can interoperate. According to ANSI (American National Standards Institute), SQL is the standard language for relational database management systems. When using SQL, one only needs to issue the command stating "what to do", without considering "how to do it". SQL statements can be used to perform various operations on a database, for example updating data in the database or extracting data from it. At present, most popular relational database management systems, such as Oracle, Sybase, Microsoft SQL Server, and Access, have adopted the SQL language standard. However, as informatization deepens, every industry has built a large number of databases, and the data volume of these databases keeps growing, which limits database query and statistics speed. For example, in a billing system, various business programs need to query the data in the database frequently; the data volume involved is huge and the database is accessed very often, so excessive database interaction reduces the performance of the computer programs.
To improve database query and statistics speed, the most common approach is to optimize the hardware system. For example, Chinese patent application No. 200610041548.6 proposes a method for accelerating database search speed. As shown in Fig. 1, it allocates shared memory segments in system memory for storing data and data indexes. A daemon process loads the data and data indexes from the database into the corresponding shared memory segments in an agreed manner so that business processes can call them. The daemon process also queries the records in the database periodically or in a loop and promptly writes updated data content into those shared memory segments.
This method of accelerating database search speed can improve database query speed to some extent and reduce the dependence on database performance. For queries or statistics on high-volume databases, however, hardware computing speed is the limit, so the method cannot fundamentally solve the problem of slow database queries. Raising computing power, for example by increasing CPU frequency, adding memory, or improving disk access speed, offers limited headroom, and hardware upgrades require substantial investment. How to effectively speed up queries or statistics on large databases has therefore remained a problem to be solved. At present, most distributed systems distribute data randomly or by hash. These are purely mathematical methods and do not match how the business data is distributed. Take personal ID card numbers: if they are distributed by hash, the data of the people of one province is spread over all nodes, so a query for that province's data must access every node. Under heavy concurrent access, the concurrency performance of the system is then poor, which inconveniences users.
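The contrast between hash distribution and content-based distribution can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the four node names, the made-up 18-character ID numbers, and the rule that the first two digits identify a province and map to one node. It is not the patent's implementation, only a way to see why a province query touches many nodes under hashing and one node under a content-based rule.

```python
import hashlib

NODES = ["n1", "n2", "n3", "n4"]  # hypothetical node names

def hash_node(id_number: str) -> str:
    """Hash distribution: the node depends only on a digest of the whole key."""
    digest = hashlib.md5(id_number.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Hypothetical content-based rule: province prefix (first two digits) -> node.
PROVINCE_TO_NODE = {"31": "n1", "32": "n2", "33": "n3", "44": "n4"}

def content_node(id_number: str) -> str:
    """Content-based distribution: the node follows the business meaning of the key."""
    return PROVINCE_TO_NODE[id_number[:2]]

def nodes_touched(id_numbers, placement):
    """Set of nodes a query over these keys would have to access."""
    return {placement(i) for i in id_numbers}

# 200 made-up ID numbers that all share the province prefix "31".
province_31 = [f"31{n:016d}" for n in range(200)]

hash_nodes = nodes_touched(province_31, hash_node)
content_nodes = nodes_touched(province_31, content_node)
print("hash distribution touches", len(hash_nodes), "node(s)")
print("content-based distribution touches", len(content_nodes), "node(s)")
```

With the content-based rule the province query is confined to a single node, which is the property the rest of the document builds on.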
Summary of the invention
The object of the invention is to provide a system that performs load balancing according to query content, so as to solve the problems raised in the background above.
To achieve this object, the invention provides the following technical solution:
A system for load balancing according to query content comprises a data segmentation server, a master server, node servers, and a front-end display module. The master server is connected to the data segmentation server, the front-end display module, and the node servers respectively. There are two or more node servers. The data segmentation server is connected to a source database that stores a large volume of data, and the data segmentation server communicates with the node servers, each of which has independent computing capability. The master server includes a node index storage unit, a thread allocation unit, a simplification unit, a main processor, and a temporary table storage unit. The node index storage unit is connected to the data segmentation server, the simplification unit, and the thread allocation unit respectively; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.
As a further aspect of the invention: each node server includes a node processor connected to a node database, and the node servers are ordinary PCs.
As a further aspect of the invention: the data segmentation server and the node servers are connected by wired or wireless means.
The method for load balancing according to query content comprises the following steps:
Step 1: set up multiple node databases;
Step 2: the data segmentation server splits the large volume of data in the source database according to a rule. When splitting, if the mapping between data content and node servers is small, it can be used directly as the index information; if the mapping is relatively large, parsing may take too long, so the simplification unit can be used to simplify the index information and improve the parsing efficiency of the thread allocation unit. The split data is then distributed to the node databases of the node servers;
Step 3: according to the segmentation rule, index information representing the data content assigned to each node database is formed during splitting and stored in the node index storage unit;
Step 4: according to the segmentation rule, the index information is simplified;
Step 5: the thread allocation unit parses the query or statistics parameters and, using the index information in the node index storage unit, assigns query or statistics tasks to the node databases, identifying the specific nodes that correspond to each particular query or statistic;
Step 6: each node processor queries or aggregates its node database in parallel and returns the result to the master server. Because each node database can compute independently, each one shares part of the query or statistics task, which greatly improves database access efficiency;
Step 7: if the result set received by the master server is small or the number of node servers (12) is small, the master server can pass the node servers' query or statistics results directly to the front-end display module; if the number of node servers is large or the data volume returned to the master server is large, the results can be copied into the temporary table storage unit, which consolidates them into one temporary table;
Step 8: the master server queries or aggregates the temporary table again to form the final result set and passes it to the front-end display module, which renders the received data as charts, tables, and similar forms and interacts with the technician.
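The choice made in step 7 between passing results straight through and consolidating them in a temporary table can be sketched as a small decision function. The source only says "small" and "large" and its two conditions overlap, so both thresholds and the tie-breaking rule below (go direct when either quantity is small) are assumptions for illustration, not values from the patent.

```python
# Hypothetical thresholds; the text gives no concrete numbers.
MAX_DIRECT_ROWS = 1_000
MAX_DIRECT_NODES = 2

def choose_result_path(rows_per_node):
    """Decide how the master server handles node results.

    rows_per_node maps a node name to the number of rows it returned.
    Returns "direct" (forward to the front-end display module) or
    "temp_table" (copy into the temporary table storage unit first).
    """
    total_rows = sum(rows_per_node.values())
    node_count = len(rows_per_node)
    # Assumed reading of step 7: either a small result set or few nodes
    # is enough to skip the temporary table.
    if total_rows <= MAX_DIRECT_ROWS or node_count <= MAX_DIRECT_NODES:
        return "direct"
    return "temp_table"

print(choose_result_path({"n1": 10}))
print(choose_result_path({"n1": 5000, "n2": 5000, "n3": 5000}))
```

A real deployment would probably tune these limits from measured transfer and merge costs rather than fix them in code.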
Compared with the prior art, the beneficial effects of the invention are as follows. Through multi-node parallel computing, the invention distributes the computation of one large database across multiple node databases, so the simultaneous computing power of multiple machines and multiple cores can be fully used and the query or statistics speed on large-volume databases greatly increased. Compared with optimizing the hardware configuration, the invention is not limited by upgrade headroom, and query or statistics speed can improve by 10, 100, or even 1000 times. The invention balances load by query content: for each query it first determines in advance which nodes the query may access, which greatly reduces unnecessary parallel computation and, on the same hardware, can raise the concurrent access capability of the whole system several-fold or even dozens of times. The node servers are ordinary PCs, so for the same gain in query or statistics speed, adding node servers costs less than upgrading the master server's hardware configuration. The invention does not depend on special hardware or networks; it can be implemented with ordinary PCs and a gigabit or even a 100 Mbit network, so the cost-performance ratio is very high.
Brief description of the drawings
Fig. 1 is a structural diagram of the system for load balancing according to query content.
Fig. 2 is a workflow diagram of the system for load balancing according to query content.
Fig. 3 is a schematic diagram of a large-volume source database in the system for load balancing according to query content.
Fig. 4 is a schematic diagram of parsing the large-volume source database in the system for load balancing according to query content.
Reference numerals: 11 - master server, 12 - node server, 13 - source database, 14 - data segmentation server, 15 - main processor, 16 - temporary table storage unit, 17 - node database, 18 - node processor, 19 - front-end display module, 20 - node index storage unit, 21 - thread allocation unit, 22 - simplification unit.
Detailed description
The technical solution of this patent is described in more detail below with reference to specific embodiments.
Referring to Figs. 1 and 2, a system for load balancing according to query content comprises a data segmentation server 14, a master server 11, node servers 12, and a front-end display module 19. The master server 11 is connected to the data segmentation server 14, the front-end display module 19, and the node servers 12 respectively. There are two or more node servers 12. The data segmentation server 14 is connected to a source database 13 that stores a large volume of data, and the data segmentation server 14 communicates with the node servers 12, each of which has independent computing capability. The master server 11 includes a node index storage unit 20, a thread allocation unit 21, a simplification unit 22, a main processor 15, and a temporary table storage unit 16. The node index storage unit 20 is connected to the data segmentation server 14, the simplification unit 22, and the thread allocation unit 21 respectively; the thread allocation unit 21 is connected to the node servers 12; and the main processor 15 is connected to the temporary table storage unit 16. Each node server 12 includes a node processor 18 connected to a node database 17, and the node servers 12 are ordinary PCs. The data segmentation server 14 and the node servers 12 are connected by wired or wireless means.
Embodiment 1
Referring to Figs. 3 and 4, Fig. 3 is a schematic diagram of a large-volume source database. The source database contains four tables: a store table, a sales table, a time table, and a product table, with data volumes of 400,000; 100,000,000; 1,825; and 1,000 respectively. First, the data of the source database must be split and distributed to the node databases. Because the store and sales tables are relatively large while the time and product tables are small, the store and sales tables are split by the store field, whereas the time and product tables are not split and are copied in full to every node database. When splitting, the data can also be sorted by the city field or the region field, so that the data of one city or one region ends up, as far as possible, in one node database or in adjacent node databases.
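The splitting rule described above can be sketched as follows. This is a minimal illustration under assumptions: the rows are tiny in-memory dictionaries, the field names `store`, `city`, and `amount` are hypothetical, and stores are assigned to nodes in equal contiguous chunks after sorting by city. The patent does not specify how the sorted stores are cut into per-node ranges.

```python
def partition_by_store(stores, sales, node_ids):
    """Split the store and sales tables by the store field.

    Stores are first sorted by city so that stores of one city land on the
    same node or on adjacent nodes, then cut into contiguous chunks.
    """
    ordered = sorted(stores, key=lambda s: (s["city"], s["store"]))
    chunk = -(-len(ordered) // len(node_ids))  # ceiling division
    store_to_node = {}
    for i, s in enumerate(ordered):
        store_to_node[s["store"]] = node_ids[i // chunk]

    node_data = {n: {"store": [], "sales": []} for n in node_ids}
    for s in stores:
        node_data[store_to_node[s["store"]]]["store"].append(s)
    for row in sales:
        # Sales rows follow the node of the store they belong to.
        node_data[store_to_node[row["store"]]]["sales"].append(row)
    return store_to_node, node_data

def replicate(table, node_ids):
    """Small tables (time, product) are copied in full to every node."""
    return {n: list(table) for n in node_ids}

NODE_IDS = ["n1", "n2", "n3"]
stores = [
    {"store": "store1", "city": "Shanghai"},
    {"store": "store2", "city": "Beijing"},
    {"store": "store3", "city": "Shanghai"},
    {"store": "store4", "city": "Beijing"},
    {"store": "store5", "city": "Guangzhou"},
    {"store": "store6", "city": "Guangzhou"},
]
sales = [
    {"store": "store1", "amount": 10},
    {"store": "store2", "amount": 20},
    {"store": "store3", "amount": 5},
]
time_table = [{"day": "2016-08-19"}]

store_to_node, node_data = partition_by_store(stores, sales, NODE_IDS)
time_copies = replicate(time_table, NODE_IDS)
print(store_to_node)
```

The returned `store_to_node` mapping is exactly the kind of correspondence that the index information records.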
Index information is then formed according to the segmentation rule. Suppose the data is split by store name; the mapping between store names and node servers is then formed. For ease of description, let the store names be store1, store2, store3, and so on; the index information can then be represented by Table 1:
Table 1
Node server    Store name
n1             store1
n2             store2
n3             store3
If the index information is large (for example, the store names are long or there are many stores), it can be simplified. For example, a mapping table with an upper-level summary field can be produced, as shown in Table 2:
Table 2
Node server    Store name    Upper-level field of store name
n1             store1        a
n2             store2        b
n3             store3        c
In Table 2, the upper-level field represents the store names with the letters a, b, and c respectively. The upper-level field may also use an abbreviation of the store name, a specific symbol, and so on; its purpose is to simplify lookup and reduce parsing time.
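As a rough illustration of Tables 1 and 2, the index and its simplified form can be held as two small mappings. The dictionary names and the helper functions are hypothetical; the store names, node names, and upper-level codes are the ones from the tables.

```python
# Table 1: store name -> node server.
NODE_INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}

# Table 2: store name -> short upper-level code.
UPPER_FIELD = {"store1": "a", "store2": "b", "store3": "c"}

def build_simplified_index(node_index, upper_field):
    """Map each short upper-level code to the set of nodes holding its data."""
    simplified = {}
    for store, node in node_index.items():
        simplified.setdefault(upper_field[store], set()).add(node)
    return simplified

SIMPLIFIED_INDEX = build_simplified_index(NODE_INDEX, UPPER_FIELD)

def nodes_for_codes(codes):
    """Resolve upper-level codes to target nodes using the simplified index."""
    nodes = set()
    for code in codes:
        nodes |= SIMPLIFIED_INDEX.get(code, set())
    return nodes

print(SIMPLIFIED_INDEX)
print(nodes_for_codes(["a", "b"]))
```

Mapping a code to a set of nodes, rather than to a single node, leaves room for one code to summarize several stores on different nodes.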
When a user accesses the database, the query or statistics parameters are parsed and, together with the index information, used to assign query or statistics tasks to the node databases. The query or statistics parameters here may be the query content, the user's permissions, and so on. For example, when a user needs data on store1 and store2, whose data was split onto nodes n1 and n2 respectively, the system uses the index information to spawn two threads, one on n1 and one on n2, for parallel computation, and spawns no thread on n3. When many users access the database at the same time, this greatly reduces the system's computation. Likewise, if a user is the manager of store1 and is only permitted to query store1 data, then even if the user's query covers store1 and store2, the system takes the intersection of the query content and the user's permissions (this is the parsing step) and spawns a thread only on n1.
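The parsing and thread allocation just described can be sketched with Python's standard thread pool. The function names, the `permitted` parameter, and the placeholder `run_on_node` are assumptions for illustration; only the store-to-node mapping and the intersection rule come from the text.

```python
from concurrent.futures import ThreadPoolExecutor

# Index information from Table 1: store name -> node server.
NODE_INDEX = {"store1": "n1", "store2": "n2", "store3": "n3"}

def resolve_targets(requested, permitted=None):
    """Intersect the query content with the user's permissions, then group
    the remaining stores by the node that holds them."""
    stores = set(requested)
    if permitted is not None:
        stores &= set(permitted)
    plan = {}
    for store in sorted(stores):
        plan.setdefault(NODE_INDEX[store], []).append(store)
    return plan

def run_on_node(node, stores):
    """Placeholder for the per-node work (hypothetical); a real system would
    execute the SQL statement against that node's database."""
    return node, stores

def dispatch(plan):
    """Spawn one thread per target node and collect the per-node results."""
    if not plan:
        return {}
    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        futures = [pool.submit(run_on_node, n, s) for n, s in plan.items()]
        return dict(f.result() for f in futures)

full_plan = resolve_targets(["store1", "store2"])
manager_plan = resolve_targets(["store1", "store2"], permitted=["store1"])
print(dispatch(full_plan))      # threads on n1 and n2 only, none on n3
print(dispatch(manager_plan))   # a single thread on n1
```

Because nodes that cannot contribute to the answer never receive a task, their capacity stays free for other concurrent queries.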
After the thread tasks are assigned, each node database is queried or aggregated, that is, each node database executes the SQL instruction. The result sets of the nodes are then imported into a temporary table and, once consolidated, queried or aggregated again: the SQL instruction is executed once more on the fully imported temporary table to obtain the final result set. Finally, the result set is passed to the front-end display module and shown using various presentation controls (such as tables and charts).
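The two-stage execution above (SQL on each node, then SQL again on a consolidated temporary table) can be sketched with in-memory SQLite databases standing in for the node databases and the master server. The table names, column names, and sample rows are made up for illustration, and SQLite is a stand-in; the patent does not name a database engine.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Made-up per-node sales rows: (store, amount).
NODE_SALES = {
    "n1": [("store1", 10.0), ("store1", 15.0)],
    "n2": [("store2", 7.5)],
    "n3": [("store3", 99.0)],
}

def query_node(node):
    """Stage 1: run the SQL statement against one node database.
    The connection is created inside the worker so it stays in one thread."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (store TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", NODE_SALES[node])
        return conn.execute(
            "SELECT store, SUM(amount), COUNT(*) FROM sales GROUP BY store"
        ).fetchall()
    finally:
        conn.close()

def gather(target_nodes):
    """Stage 2: import node result sets into a temporary table on the master
    and run SQL again to obtain the final result set."""
    with ThreadPoolExecutor(max_workers=max(1, len(target_nodes))) as pool:
        partials = list(pool.map(query_node, target_nodes))
    master = sqlite3.connect(":memory:")
    try:
        master.execute(
            "CREATE TEMP TABLE node_results (store TEXT, total REAL, n INTEGER)"
        )
        for rows in partials:
            master.executemany("INSERT INTO node_results VALUES (?, ?, ?)", rows)
        return master.execute(
            "SELECT SUM(total), SUM(n) FROM node_results"
        ).fetchone()
    finally:
        master.close()

result = gather(["n1", "n2"])  # only the nodes chosen by the thread allocation
print(result)
```

Sums and counts merge cleanly this way; aggregates such as averages or distinct counts need the partial results to carry enough information (for example sum and count) for the second pass to combine them correctly.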
The preferred embodiments of this patent have been described in detail above, but this patent is not limited to those embodiments; within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the spirit of this patent.

Claims (5)

1. A system for load balancing according to query content, characterized in that it comprises a data segmentation server, a master server, node servers, and a front-end display module; the master server is connected to the data segmentation server, the front-end display module, and the node servers respectively; there are two or more node servers; the data segmentation server is connected to a source database that stores a large volume of data; the data segmentation server communicates with the node servers, each of which has independent computing capability; the master server includes a node index storage unit, a thread allocation unit, a simplification unit, a main processor, and a temporary table storage unit; the node index storage unit is connected to the data segmentation server, the simplification unit, and the thread allocation unit respectively; the thread allocation unit is connected to the node servers; and the main processor is connected to the temporary table storage unit.
2. The system for load balancing according to query content according to claim 1, characterized in that each node server includes a node processor connected to a node database, and the node servers are ordinary PCs.
3. The system for load balancing according to query content according to claim 1, characterized in that the data segmentation server and the node servers are connected by wired or wireless means.
4. A working method of the system for load balancing according to query content according to any one of claims 1-3, characterized in that the specific steps are as follows:
Step 1: set up multiple node databases;
Step 2: the data segmentation server splits the large volume of data in the source database according to a rule. When splitting, if the mapping between data content and node servers is small, it can be used directly as the index information; if the mapping is relatively large, parsing may take too long, so the simplification unit can be used to simplify the index information and improve the parsing efficiency of the thread allocation unit. The split data is then distributed to the node databases of the node servers;
Step 3: according to the segmentation rule, index information representing the data content assigned to each node database is formed during splitting and stored in the node index storage unit;
Step 4: according to the segmentation rule, the index information is simplified;
Step 5: the thread allocation unit parses the query or statistics parameters and, using the index information in the node index storage unit, assigns query or statistics tasks to the node databases, identifying the specific nodes that correspond to each particular query or statistic;
Step 6: each node processor queries or aggregates its node database in parallel and returns the result to the master server.
5. Wherein each node database can compute independently, so that each node database shares part of the query or statistics task, which greatly improves database access efficiency;
Step 7: if the result set received by the master server is small or the number of node servers (12) is small, the master server can pass the node servers' query or statistics results directly to the front-end display module; if the number of node servers is large or the data volume returned to the master server is large, the results can be copied into the temporary table storage unit, which consolidates them into one temporary table;
Step 8: the master server queries or aggregates the temporary table again to form the final result set and passes it to the front-end display module, which renders the received data as charts, tables, and similar forms and interacts with the technician.
CN201610688791.0A 2016-08-19 2016-08-19 System and method for balancing load according to content to be inquired Pending CN106339432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610688791.0A CN106339432A (en) 2016-08-19 2016-08-19 System and method for balancing load according to content to be inquired

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610688791.0A CN106339432A (en) 2016-08-19 2016-08-19 System and method for balancing load according to content to be inquired

Publications (1)

Publication Number Publication Date
CN106339432A (en) 2017-01-18

Family

ID=57824290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610688791.0A Pending CN106339432A (en) 2016-08-19 2016-08-19 System and method for balancing load according to content to be inquired

Country Status (1)

Country Link
CN (1) CN106339432A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109933719A (en) * 2019-01-30 2019-06-25 维沃移动通信有限公司 A kind of searching method and terminal device
WO2019128978A1 (en) * 2017-12-29 2019-07-04 阿里巴巴集团控股有限公司 Database system, and method and device for querying database
CN111309805A (en) * 2019-12-13 2020-06-19 华为技术有限公司 Data reading and writing method and device for database

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101120340A (en) * 2004-02-21 2008-02-06 数据迅捷股份有限公司 Ultra-shared-nothing parallel database
CN101908075A (en) * 2010-08-17 2010-12-08 上海云数信息科技有限公司 SQL-based parallel computing system and method
CN101916280A (en) * 2010-08-17 2010-12-15 上海云数信息科技有限公司 Parallel computing system and method for carrying out load balance according to query contents
CN101916281A (en) * 2010-08-17 2010-12-15 上海云数信息科技有限公司 Concurrent computational system and non-repetition counting method
US20110125745A1 (en) * 2009-11-25 2011-05-26 Bmc Software, Inc. Balancing Data Across Partitions of a Table Space During Load Processing

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101120340A (en) * 2004-02-21 2008-02-06 数据迅捷股份有限公司 Ultra-shared-nothing parallel database
US20110125745A1 (en) * 2009-11-25 2011-05-26 Bmc Software, Inc. Balancing Data Across Partitions of a Table Space During Load Processing
CN101908075A (en) * 2010-08-17 2010-12-08 上海云数信息科技有限公司 SQL-based parallel computing system and method
CN101916280A (en) * 2010-08-17 2010-12-15 上海云数信息科技有限公司 Parallel computing system and method for carrying out load balance according to query contents
CN101916281A (en) * 2010-08-17 2010-12-15 上海云数信息科技有限公司 Concurrent computational system and non-repetition counting method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019128978A1 (en) * 2017-12-29 2019-07-04 阿里巴巴集团控股有限公司 Database system, and method and device for querying database
US11789957B2 (en) 2017-12-29 2023-10-17 Alibaba Group Holding Limited System, method, and apparatus for querying a database
CN109933719A (en) * 2019-01-30 2019-06-25 维沃移动通信有限公司 A kind of searching method and terminal device
CN111309805A (en) * 2019-12-13 2020-06-19 华为技术有限公司 Data reading and writing method and device for database
CN111309805B (en) * 2019-12-13 2023-10-20 华为技术有限公司 Data reading and writing method and device for database
US11868333B2 (en) 2019-12-13 2024-01-09 Huawei Technologies Co., Ltd. Data read/write method and apparatus for database

Similar Documents

Publication Publication Date Title
US20220405284A1 (en) Geo-scale analytics with bandwidth and regulatory constraints
US10963428B2 (en) Multi-range and runtime pruning
US20220253421A1 (en) Index Sharding
CN106547796B (en) Database execution method and device
US20200019552A1 (en) Query optimization method and related apparatus
US20170083573A1 (en) Multi-query optimization
US8538954B2 (en) Aggregate function partitions for distributed processing
US6801903B2 (en) Collecting statistics in a database system
CN101916280A (en) Parallel computing system and method for carrying out load balance according to query contents
WO2020135613A1 (en) Data query processing method, device and system, and computer-readable storage medium
CN110909111B (en) Distributed storage and indexing method based on RDF data characteristics of knowledge graph
CN101908075A (en) SQL-based parallel computing system and method
CN104123346A (en) Structural data searching method
US20100235344A1 (en) Mechanism for utilizing partitioning pruning techniques for xml indexes
Labouseur et al. Scalable and Robust Management of Dynamic Graph Data.
CN112015741A (en) Method and device for storing massive data in different databases and tables
US11809468B2 (en) Phrase indexing
US11797536B2 (en) Just-in-time injection in a distributed database
CN101916281B (en) Concurrent computational system and non-repetition counting method
CN106339432A (en) System and method for balancing load according to content to be inquired
US20230315701A1 (en) Data unification
Braganholo et al. A survey on xml fragmentation
Xu et al. Semantic connection set-based massive RDF data query processing in Spark environment
CN112818010B (en) Database query method and device
CN113742346A (en) Asset big data platform architecture optimization method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170118