CN106648897A - SOLR cluster extension method and system supporting resource balancing - Google Patents
SOLR cluster extension method and system supporting resource balancing
- Publication number: CN106648897A
- Application number: CN201611234696.XA
- Authority: CN (China)
- Prior art keywords: solr, shard, core, clusters, nodes
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1727—Details of free space management performed by the file system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
Abstract
The invention provides a SOLR cluster extension method and system supporting resource balancing. When a new server is added, new shards are created automatically according to the number of nodes on the servers and the growth of the data volume, so that new data is ingested onto the corresponding servers in a balanced manner. With the method and system, servers can be added to a SOLR cluster automatically and flexibly according to system performance and data volume. The system then creates shards and replica sets automatically according to server performance and balances the data volume across the corresponding SOLR Cores. No manual shard creation is needed, and local hot spots are avoided.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a SOLR cluster extension method and system supporting resource balancing.
Background art
With the progress of society we have entered the era of big data, and the storage and retrieval of massive data are applied in every field. Full-text search is one of the common functions, providing query behaviour similar to that of Baidu or Taobao. SOLR is among the most widely used enterprise search servers for full-text search. It is feature rich, offers near-real-time retrieval and supports clustering, and it is a free, open-source project under Apache.
SOLR's own clustering mechanism is fairly complete: the SOLRCloud component supports shards and replica sets (replication). When the data volume grows to the point where server resources are insufficient, new servers are usually added to the cluster to share the load. SOLRCloud supports two sharding modes: compositeId and implicit. compositeId computes a hash of the document ID to decide which shard a document falls into. The number of shards must be fixed when the collection is created, so shards cannot easily be added later, which makes this mode unsuitable for dynamic horizontal scaling of the cluster. The implicit mode allows a shard key to be specified when a shard is created, and the value of the shard key set when data is inserted determines which shard stores the data. The interface of the implicit mode therefore makes it possible to add shards to a new server manually or with simple automation.
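For illustration, the requests involved in the implicit mode can be sketched as follows. This is a minimal sketch, assuming the standard Solr Collections API parameters (`action=CREATE`, `router.name`, `router.field`, `action=CREATESHARD`, `createNodeSet`); the host address, the collection name and the field name `shard_key` are hypothetical and do not come from the patent. The code only builds the URLs and sends nothing.

```python
from urllib.parse import urlencode

# Assumed admin endpoint of a local SolrCloud node (hypothetical address).
SOLR_ADMIN = "http://localhost:8983/solr/admin/collections"


def create_collection_url(name, config_name, first_shard, replication_factor):
    """Build the URL that creates a collection using the implicit router."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "router.name": "implicit",
        "router.field": "shard_key",  # hypothetical field holding the shard name
        "shards": first_shard,
        "replicationFactor": replication_factor,
    }
    return SOLR_ADMIN + "?" + urlencode(params)


def create_shard_url(collection, shard, nodes):
    """Build the URL that adds a named shard on the given nodes."""
    params = {
        "action": "CREATESHARD",
        "collection": collection,
        "shard": shard,
        "createNodeSet": ",".join(nodes),
    }
    return SOLR_ADMIN + "?" + urlencode(params)
```

With the compositeId router a shard cannot be added this way, which is why the schemes discussed here build on the implicit mode.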
In the prior art, the implicit sharding mode of SOLRCloud provides an interface for adding shards dynamically. One class of methods therefore simply creates shards automatically by time period, for example one shard per month: on the first day of each month, a shard is added automatically on the few servers holding the least data, to store the coming month's data. Although this achieves automatic sharding, it produces local hot spots. Moreover, the resources of the servers in a cluster may differ. In a cluster that mixes old and new servers, for example, disk speed, disk space and memory size may all differ between the old and new servers, so shards cannot simply be distributed evenly.
Summary of the invention
To this end, the present invention proposes a SOLR cluster extension method and system supporting resource balancing. When a new server joins, new shards are created automatically according to the number of nodes on the servers and the growth of the data volume, so that new data is ingested onto the corresponding servers in a balanced manner. The invention first installs a corresponding number of SOLR nodes according to each server's hardware resources. A thread then monitors the number of documents in the cluster and creates shards dynamically as required. The ingestion thread uses the sharding parameters adjusted by that monitoring thread to insert newly ingested data evenly into the new shards.
The specific scheme is as follows:
A SOLR cluster extension method supporting resource balancing comprises the steps of:
S10, installing SOLR nodes according to the hardware resources of the server;
S20, setting the parameters of the SOLR cluster;
S30, dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
S40, inserting new documents to be written into the corresponding shards according to the state values of the SOLR cluster updated in step S30, and updating the number of documents;
repeating steps S10 to S40.
Further, step S10 specifically comprises: obtaining the maximum number of SOLR nodes that the server's CPU, memory, disk space and network bandwidth can each support, taking the minimum of these values as the number of SOLR nodes the server can support, and installing that number of SOLR nodes.
Further, the parameters of the SOLR cluster comprise: name, the name of the collection; configName, the configuration name of the collection; serverNodes, the nodes on which the collection is allowed to create Cores; and replicationFactor, the replication factor. The dynamic values of the SOLR cluster comprise: liveShardMinIndex, the minimum index in the list of shards that store ingested data; liveShardMaxIndex, the maximum index in that list; liveCoreNumPerNode, the average number of Cores allocated to each node; nextShardedDocNum, the number of documents at which the next sharding takes place; and sumCoreNum, the number of Cores on all nodes.
Further, step S30 specifically comprises:
S300, creating a first thread, which monitors the number of documents in the SOLR cluster and performs steps S301 and S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, new shards are needed and the method proceeds to step S302; if it is less, the method proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, judging whether addCoreNum is less than the replication factor replicationFactor; if so, proceeding to step S305, otherwise proceeding to step S307;
S305, increasing liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, rounded to an integer;
S308, reading the SOLR cluster state values, obtaining the number of SOLR Cores installed on each node and comparing it with liveCoreNumPerNode; the difference is the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores that may be installed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer: for the i-th shard added in the current round, X is liveCoreNumPerNode + i;
S310, judging whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeping for a period of time and, after waking, returning to step S301 to check again.
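The arithmetic of steps S303 to S307 and the state update of step S311 can be sketched as follows. This is a minimal sketch under stated assumptions: the state is held in a plain dictionary keyed by the names used in the text, "rounded to an integer" is taken to mean rounding down, and `node_count` stands for the value written as Nodes in the formulas. The function name and argument names are illustrative only.

```python
def plan_new_shards(state, node_count, replication_factor, doc_num_per_core):
    """Return addShard and update the state values in place (S303-S307, S311)."""
    # S303: Cores planned for this round.
    add_core_num = state["liveCoreNumPerNode"] * node_count - state["sumCoreNum"]
    # S304-S306: too few Cores for even one shard, so raise the per-node
    # average by one and recalculate.
    if add_core_num < replication_factor:
        state["liveCoreNumPerNode"] += 1
        add_core_num = state["liveCoreNumPerNode"] * node_count - state["sumCoreNum"]
    # S307: shards planned for this round (assumption: rounded down).
    add_shard = add_core_num // replication_factor
    # S311: the new minimum index is derived from the old maximum index,
    # so it must be assigned before the maximum index is advanced.
    state["liveShardMinIndex"] = state["liveShardMaxIndex"] + 1
    state["liveShardMaxIndex"] = state["liveShardMaxIndex"] + add_shard
    state["sumCoreNum"] = state["sumCoreNum"] + add_core_num
    state["nextShardedDocNum"] = state["sumCoreNum"] * doc_num_per_core
    return add_shard
```

For example, with 4 nodes, a replication factor of 2, liveCoreNumPerNode = 1 and sumCoreNum = 4, the first calculation yields 0 Cores, the average is raised to 2, and the round plans 4 Cores, that is 2 new shards.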
Further, step S40 specifically comprises:
S401, creating a second thread and a mutex lock for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shards, the mutex lock being the same mutex lock as that created for the first thread;
S402, reading the SOLR cluster parameters liveShardMinIndex and liveShardMaxIndex;
S403, randomly selecting a shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J the value of liveShardMaxIndex) as the target shard for ingesting the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
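Steps S401 to S403 can be sketched as follows. This is a minimal sketch, assuming the live shard range is kept in a dictionary guarded by a `threading.Lock` that the shard creation thread would also hold while it updates the range. The field name `shard_key`, the sample range and the function names are hypothetical; the actual submission of the document (S404) is left out.

```python
import random
import threading

# Shared with the shard creation thread in a real system (assumption).
state_lock = threading.Lock()
live_range = {"liveShardMinIndex": 3, "liveShardMaxIndex": 4}


def choose_target_shard(rng=random):
    """S402-S403: read the live range under the lock, then pick a shard at random."""
    with state_lock:
        low = live_range["liveShardMinIndex"]
        high = live_range["liveShardMaxIndex"]
    # randint is inclusive at both ends, matching shardI to shardJ.
    return f"shard{rng.randint(low, high)}"


def route_document(doc, shard_key_field="shard_key", rng=random):
    """Return a copy of the document with the shard key field set (S403)."""
    routed = dict(doc)
    routed[shard_key_field] = choose_target_shard(rng)
    return routed
```

Because every live shard is equally likely to be chosen, new documents spread evenly over the shards created in the latest round rather than concentrating on one of them.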
A SOLR cluster extension system supporting resource balancing comprises:
a node installation module for installing SOLR nodes according to the hardware resources of the server;
a setup module for setting the parameters of the SOLR cluster;
a shard creation module for dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
a data insertion module for inserting new documents to be written into the corresponding shards according to the updated state values of the SOLR cluster, and updating the number of documents;
a loop module for cyclically invoking the node installation module, the setup module, the shard creation module and the data insertion module.
Further, the node installation module is specifically further configured to: obtain the maximum number of SOLR nodes that the server's CPU, memory, disk space and network bandwidth can each support, take the minimum of these values as the number of SOLR nodes the server can support, and install that number of SOLR nodes.
Further, the parameters of the SOLR cluster in the setup module comprise: name, the name of the collection; configName, the configuration name of the collection; serverNodes, the nodes on which the collection is allowed to create Cores; and replicationFactor, the replication factor. The dynamic values of the SOLR cluster comprise: liveShardMinIndex, the minimum index in the list of shards that store ingested data; liveShardMaxIndex, the maximum index in that list; liveCoreNumPerNode, the average number of Cores allocated to each node; nextShardedDocNum, the number of documents at which the next sharding takes place; and sumCoreNum, the number of Cores on all nodes.
Further, the shard creation module is specifically further configured to perform the following steps:
S300, creating a first thread, which monitors the number of documents in the SOLR cluster and performs steps S301 and S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, new shards are needed and the method proceeds to step S302; if it is less, the method proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, judging whether addCoreNum is less than the replication factor replicationFactor; if so, proceeding to step S305, otherwise proceeding to step S307;
S305, increasing liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, rounded to an integer;
S308, reading the SOLR cluster state values, obtaining the number of SOLR Cores installed on each node and comparing it with liveCoreNumPerNode; the difference is the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores that may be installed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer: for the i-th shard added in the current round, X is liveCoreNumPerNode + i;
S310, judging whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeping for a period of time and, after waking, returning to step S301 to check again.
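The control flow of the monitoring thread (S301, S302, S312 and S313) can be sketched as follows. This is a minimal sketch, assuming the document count is obtained through a caller-supplied function and that steps S303 to S311 are delegated to a caller-supplied `create_shards` callable. Both callables, the dictionary-based state and the default sleep interval are hypothetical.

```python
import threading


def monitor_once(read_doc_count, state, lock, create_shards):
    """One pass of S301 to S312; returns True when a sharding round ran."""
    # S301: below the threshold, nothing to do until the next wake-up.
    if read_doc_count() < state["nextShardedDocNum"]:
        return False
    # S302 and S312: hold the shared mutex for the whole round so that the
    # ingestion thread never reads half-updated state values.
    with lock:
        create_shards(state)  # S303 to S311, supplied by the caller
    return True


def monitor_loop(read_doc_count, state, lock, create_shards,
                 stop_event, interval_s=60.0):
    """S300 and S313: repeat the check, sleeping between passes."""
    while not stop_event.is_set():
        monitor_once(read_doc_count, state, lock, create_shards)
        # Event.wait doubles as an interruptible sleep.
        stop_event.wait(interval_s)
```

`monitor_loop` would typically be the target of a `threading.Thread`, with `stop_event` set to shut the monitor down.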
Further, the data insertion module is specifically further configured to perform the following steps:
S401, creating a second thread and a mutex lock for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shards, the mutex lock being the same mutex lock as that created for the first thread;
S402, reading the SOLR cluster parameters liveShardMinIndex and liveShardMaxIndex;
S403, randomly selecting a shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J the value of liveShardMaxIndex) as the target shard for ingesting the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
Beneficial effects of the present invention: 1) Data volumes proportional to server performance can be stored across the SOLR cluster in a load-balanced manner, and shards are created automatically according to a target data volume to extend the cluster. This prevents any single shard from holding too much data and avoids local hot spots when new data is inserted. In addition, after a new server joins the cluster, it is recognized automatically and the number of SOLR Cores and the number of documents are load balanced;
2) A dynamic model is proposed in which a thread monitors the growth of the document volume in SOLR, dynamically creates new Cores under certain conditions according to the number of nodes on the servers, and distributes subsequently inserted data to the new Cores, thereby achieving dynamic sharding and load balancing of the SOLR cluster.
Brief description of the drawings
Fig. 1 is a flowchart of dynamic shard creation according to an embodiment of the invention;
Fig. 2 is a flowchart of inserting documents into the corresponding shards according to an embodiment of the invention.
Detailed description of the embodiments
To further illustrate the embodiments, the invention is provided with accompanying drawings. These drawings form part of the disclosure; they mainly illustrate the embodiments and, together with the relevant description, explain their operating principles. With reference to them, those of ordinary skill in the art will understand other possible embodiments and the advantages of the invention. The invention is now further described with reference to the drawings and specific embodiments.
The SOLR cluster extension method supporting resource balancing according to an embodiment of the invention comprises the following steps:
1. Install SOLR nodes according to server hardware resources:
In a SOLR cluster, data is physically stored on nodes. To make full use of each server's resources, the maximum number of nodes that can be installed must first be estimated. The maximum number of nodes a server can support depends mainly on the performance of hardware such as CPU, memory, disk space and network bandwidth. For each of these dependencies, an estimation function is first given from experience as the calculation formula. For memory, for example, take the total memory size Sall of the server and subtract the memory Sother needed by the operating system and other non-SOLR applications. This gives the maximum memory that can be allocated to SOLR nodes, SSOLR = Sall - Sother. Dividing by the memory Snode needed by each node gives the maximum number of SOLR nodes that memory allows. The estimation function fmem for the number of nodes supported by memory is:
fmem = (Sall - Sother) / Snode
Similarly, estimation functions fcpu, fdisk and fnet for the number of SOLR nodes supported by CPU, disk and network are given according to the specific system and business application. The number of SOLR nodes the server can finally support is the minimum, fserver = Min(fdisk, fcpu, fmem, fnet). SOLR nodes are installed according to this estimate and added to SOLRCloud; by default no shards are created at first.
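The memory estimate and the final minimum can be sketched as follows. This is a minimal sketch, assuming all memory sizes use the same unit and that a fractional result is rounded down, since part of a node cannot be installed; the text itself only gives the division. The function names are illustrative.

```python
def nodes_by_memory(s_all, s_other, s_node):
    """fmem = (Sall - Sother) / Snode, rounded down (assumption), never negative."""
    return max(0, (s_all - s_other) // s_node)


def supported_node_count(f_cpu, f_mem, f_disk, f_net):
    """fserver = Min(fdisk, fcpu, fmem, fnet): the scarcest resource decides."""
    return min(f_disk, f_cpu, f_mem, f_net)
```

For a server with 64 GB of memory, 8 GB reserved for the operating system and other applications, and 8 GB per node, memory allows 7 nodes; if the CPU estimate is 6, the server is given 6 SOLR nodes.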
2. Set the basic parameters of the dynamic cluster:
In a dynamic SOLR cluster, some basic parameters must be set first: for which collection (name), on which nodes (serverNodes) and with which configuration (configName) shards are created, and on how many nodes each shard is created (replicationFactor). Next, some state values of the dynamic cluster are recorded: the index range of the currently live shards (liveShardMinIndex to liveShardMaxIndex), the current average number of Cores allocated to each node, the number of documents at which shards will next be added, and the current number of Cores on all relevant nodes. These parameters and dynamic values are persisted to disk in a form such as a database or XML.
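One way to hold the dynamic values and persist them as XML can be sketched as follows. This is a minimal sketch: the field names follow the text, while the element name `dynamicState`, the flat XML layout and the function names are assumptions made for illustration. Writing the string to a file is left to the caller.

```python
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass, fields


@dataclass
class DynamicState:
    liveShardMinIndex: int
    liveShardMaxIndex: int
    liveCoreNumPerNode: int
    nextShardedDocNum: int
    sumCoreNum: int


def state_to_xml(state):
    """Serialize the state as one child element per value."""
    root = ET.Element("dynamicState")
    for name, value in asdict(state).items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def state_from_xml(text):
    """Rebuild the state from the XML produced by state_to_xml."""
    root = ET.fromstring(text)
    values = {f.name: int(root.findtext(f.name)) for f in fields(DynamicState)}
    return DynamicState(**values)
```

Persisting after every sharding round lets both threads recover the live shard range after a restart.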
3. Create shards dynamically according to the parameters and adjust the ingestion strategy, to support the dynamic cluster:
A thread is first created that periodically checks the number of documents and dynamically creates shards according to the number of documents and the system parameters. As shown in Fig. 1, the main process is as follows:
Step 1: Read the total number of documents of the collection from SOLRCloud and compare it with nextShardedDocNum. If it is greater than or equal to nextShardedDocNum, new shards are needed; go to step 2. Otherwise go to step 13;
Step 2: Acquire the mutex lock, to prevent the ingestion thread from reading incomplete data;
Step 3: Calculate the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
Step 4: Judge whether addCoreNum is less than the replication factor replicationFactor. If so, go to step 5; otherwise go to step 7;
Step 5: Increase liveCoreNumPerNode by 1;
Step 6: Recalculate the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
Step 7: Calculate the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, rounded to an integer;
Step 8: Read the cluster state from ZooKeeper, obtain the number of SOLR Cores installed on each node and compare it with liveCoreNumPerNode. The difference is the maximum number of Cores that may still be installed on that node;
Step 9: According to the number of Cores that may be installed on each node, create one shard on replicationFactor Cores. To provide disaster tolerance, these replicationFactor Cores are distributed on different nodes as far as possible. The shard is named shardX, where X is an incrementing, non-repeating integer: for the i-th shard added in the current round, X is liveCoreNumPerNode + i;
Step 10: Judge whether addShard shards have been created. If not, return to step 9 to continue creating shards; if so, go to step 11;
Step 11: Update the state values of the dynamic sharding and persist the update to disk:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
Step 12: Release the mutex lock so that ingestion can continue;
Step 13: The thread sleeps for a period of time and, after waking, returns to step 1 to check again.
Note: when a new server joins the cluster, SOLR nodes are first installed on it manually and configured into the cluster, and the serverNodes parameter is updated. Cores are not created on the new server immediately after it joins. To keep the data distribution balanced, creation is triggered only in the next round in which shards need to be created. Because the nodes on the new server have no Cores yet, most of the Cores created in that round are placed on the new server, so that Cores become evenly distributed over the nodes. If a new server has joined the cluster, the judgment in step 4 jumps directly to step 7; otherwise execution proceeds in order from step 4 to step 7.
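The placement described in steps 8 and 9, and the way a newly added server receives most of the new Cores, can be sketched as follows. This is a minimal sketch under stated assumptions: each shard's replicas are placed on the nodes with the most remaining capacity, ties are broken by node name, and the function raises an error when fewer than replicationFactor nodes have capacity left, which is stricter than the "as far as possible" wording of the text. The node names and the function name are hypothetical.

```python
def place_shards(cores_on_node, live_core_num_per_node, add_shard, replication_factor):
    """Return one list of target nodes per new shard (steps 8 to 10)."""
    # Step 8: remaining capacity is the per-node target minus the installed Cores.
    capacity = {
        node: max(0, live_core_num_per_node - installed)
        for node, installed in cores_on_node.items()
    }
    placements = []
    for _ in range(add_shard):
        # Step 9: prefer the emptiest nodes; each node appears at most once per
        # shard, so the replicas of one shard land on different nodes.
        candidates = sorted(
            (node for node in capacity if capacity[node] > 0),
            key=lambda node: (-capacity[node], node),
        )
        chosen = candidates[:replication_factor]
        if len(chosen) < replication_factor:
            raise ValueError("not enough nodes with free capacity")
        for node in chosen:
            capacity[node] -= 1
        placements.append(chosen)
    return placements
```

With two existing nodes holding one Core each, one empty new node, a per-node target of 2 and a replication factor of 2, both new shards place a replica on the new node, which ends the round with as many Cores as the older nodes.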
4. The data parsing and ingestion thread inserts new data evenly into the corresponding Cores according to the state values of the dynamic sharding. As shown in Fig. 2, the main process is as follows:
Step 1: To avoid reading incomplete data, first acquire the mutex lock, which is the same lock as that used by the dynamic sharding thread;
Step 2: Read the dynamic sharding parameters liveShardMinIndex and liveShardMaxIndex;
Step 3: Randomly select a shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J the value of liveShardMaxIndex) as the target shard for ingesting the new data, and set the shard key field to the name of the selected shard;
Step 4: Submit the document. New data is thus inserted into the corresponding shards in a balanced manner, and local hot spots are eliminated.
In another embodiment, the present invention proposes a SOLR cluster extension system supporting resource balancing, comprising:
a node installation module for installing SOLR nodes according to the hardware resources of the server;
a setup module for setting the parameters of the SOLR cluster;
a shard creation module for dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
a data insertion module for inserting new documents to be written into the corresponding shards according to the updated state values of the SOLR cluster, and updating the number of documents;
a loop module for cyclically invoking the node installation module, the setup module, the shard creation module and the data insertion module.
Further, the node installation module is specifically further configured to: obtain the maximum number of SOLR nodes that the server's CPU, memory, disk space and network bandwidth can each support, take the minimum of these values as the number of SOLR nodes the server can support, and install that number of SOLR nodes.
Further, the parameters of the SOLR cluster in the setup module comprise: name, the name of the collection; configName, the configuration name of the collection; serverNodes, the nodes on which the collection is allowed to create Cores; and replicationFactor, the replication factor. The dynamic values of the SOLR cluster comprise: liveShardMinIndex, the minimum index in the list of shards that store ingested data; liveShardMaxIndex, the maximum index in that list; liveCoreNumPerNode, the average number of Cores allocated to each node; nextShardedDocNum, the number of documents at which the next sharding takes place; and sumCoreNum, the number of Cores on all nodes.
Further, the shard creation module is specifically further configured to perform the following steps:
S300, creating a first thread, which monitors the number of documents in the SOLR cluster and performs steps S301 and S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, new shards are needed and the method proceeds to step S302; if it is less, the method proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, judging whether addCoreNum is less than the replication factor replicationFactor; if so, proceeding to step S305, otherwise proceeding to step S307;
S305, increasing liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added this time according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, rounded to an integer;
S308, reading the SOLR cluster state values, obtaining the number of SOLR Cores installed on each node and comparing it with liveCoreNumPerNode; the difference is the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores that may be installed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer: for the i-th shard added in the current round, X is liveCoreNumPerNode + i;
S310, judging whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeping for a period of time and, after waking, returning to step S301 to check again.
Further, the data insertion module is specifically further configured to perform the following steps:
S401, creating a second thread and a mutex lock for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shards, the mutex lock being the same mutex lock as that created for the first thread;
S402, reading the SOLR cluster parameters liveShardMinIndex and liveShardMaxIndex;
S403, randomly selecting a shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J the value of liveShardMaxIndex) as the target shard for ingesting the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
With the above method and system, servers can be added to a SOLR cluster automatically and flexibly according to system performance and data volume. The system then creates shards and replica sets automatically according to server performance and balances the data volume across the corresponding SOLR Cores, without manual shard creation and without causing local hot spots.
Although the invention has been specifically shown and described with reference to preferred embodiments, those skilled in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims, and that such changes fall within the protection scope of the invention.
Claims (10)
1. a kind of SOLR cluster expansion methods for supporting balanced resource, it is characterised in that including step:
S10, according to the hardware resource of server SOLR nodes are installed;
S20, arranges the parameter of SOLR clusters;
S30, the state value of the parameter, current number of documents and current SOLR clusters of the SOLR clusters in SOLR clusters
Dynamic creation burst, the state value of SOLR clusters is updated;
S40, the new document that the state value of the SOLR clusters after being updated according to step S30 will write is inserted into corresponding burst,
Number of documents is updated;
Circulation execution step S10 to S40.
2. a kind of SOLR cluster expansion methods for supporting balanced resource according to claim 1, it is characterised in that described
Step S10 is specifically included:Obtaining CPU, internal memory, disk space and the network bandwidth of server respectively can allow the most of support
SOLR nodes, therefrom acquisition minima is server can support that SOLR nodes take minima, by the quantity of the minima
SOLR nodes are installed.
3. The SOLR cluster expansion method supporting resource balancing according to claim 1, characterized in that the parameters of the SOLR cluster include: name, i.e. the name of the collection; configName, i.e. the configuration name of the collection; serverNodes, i.e. the nodes on which the collection is allowed to create Cores; and replicationFactor, i.e. the replication factor; and the state values of the SOLR cluster include: liveShardMinIndex, i.e. the minimum index in the list of shards that store incoming data; liveShardMaxIndex, i.e. the maximum index in the list of shards that store incoming data; liveCoreNumPerNode, i.e. the average number of Cores allocated to each node; nextShardedDocNum, i.e. the number of documents at which the next sharding takes place; and sumCoreNum, i.e. the total number of Cores on all nodes.
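The parameters and state values named in this claim can be pictured as two small records. The sketch below is only an illustration: the field names follow the claim, while the class names `CollectionParams` and `ClusterState` and all example values are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CollectionParams:
    """User-set parameters of the SOLR cluster (step S20)."""
    name: str                 # name of the collection
    configName: str           # configuration name of the collection
    serverNodes: List[str]    # nodes allowed to host Cores
    replicationFactor: int    # number of Core replicas per shard


@dataclass
class ClusterState:
    """State values maintained while the cluster grows."""
    liveShardMinIndex: int    # smallest index among shards accepting new data
    liveShardMaxIndex: int    # largest index among shards accepting new data
    liveCoreNumPerNode: int   # average number of Cores per node
    nextShardedDocNum: int    # document count that triggers the next sharding
    sumCoreNum: int           # total number of Cores on all nodes


# Hypothetical example values.
params = CollectionParams("docs", "docs_conf", ["node1", "node2", "node3", "node4"], 2)
state = ClusterState(1, 2, 1, 4_000_000, 4)
print(len(params.serverNodes), state.sumCoreNum)  # prints 4 4
```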
4. The SOLR cluster expansion method supporting resource balancing according to claim 3, characterized in that step S30 specifically comprises:
S300, creating a first thread, the first thread monitoring the number of documents in the SOLR cluster and executing steps S301 and S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, a new shard is needed and the method proceeds to step S302; if it is smaller, the method proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added in this round according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, judging whether addCoreNum is less than the replication factor replicationFactor; if it is, proceeding to step S305; otherwise proceeding to step S307;
S305, increasing the value of liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added in this round according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, taking the integer part;
S308, reading the SOLR cluster state values to obtain the number of SOLR Cores installed on each node, and comparing it with the value of liveCoreNumPerNode, the difference being the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores allowed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes, the shard being named shardX, where X is an incrementing, non-repeating integer, and for the i-th shard added in the current round the value of X is liveCoreNumPerNode + i;
S310, judging whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeping for a certain time and, after waking, returning to step S301 to check again.
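The arithmetic of steps S303 to S307 and the state update of step S311 can be traced with a short sketch. It is an illustration under assumptions: the state is held in a plain dictionary, the function name `plan_expansion` and the example numbers are hypothetical, and thread handling, locking, Core creation and persistence to disk are left out.

```python
def plan_expansion(state: dict, nodes: int, replication_factor: int,
                   doc_num_per_core: int):
    """Return (new_state, add_core_num, add_shard) for one expansion round."""
    live = state["liveCoreNumPerNode"]
    add_core = live * nodes - state["sumCoreNum"]        # S303
    if add_core < replication_factor:                    # S304
        live += 1                                        # S305
        add_core = live * nodes - state["sumCoreNum"]    # S306
    add_shard = add_core // replication_factor           # S307, integer part

    # S311: updates applied in the order given in the claim, so that
    # liveShardMinIndex is derived from the old liveShardMaxIndex.
    new_state = dict(state)
    new_state["liveCoreNumPerNode"] = live
    new_state["liveShardMinIndex"] = state["liveShardMaxIndex"] + 1
    new_state["liveShardMaxIndex"] = state["liveShardMaxIndex"] + add_shard
    new_state["sumCoreNum"] = state["sumCoreNum"] + add_core
    new_state["nextShardedDocNum"] = new_state["sumCoreNum"] * doc_num_per_core
    return new_state, add_core, add_shard


# Hypothetical cluster: 4 nodes, one Core each, two live shards.
state = {"liveShardMinIndex": 1, "liveShardMaxIndex": 2,
         "liveCoreNumPerNode": 1, "sumCoreNum": 4,
         "nextShardedDocNum": 4_000_000}
new_state, add_core, add_shard = plan_expansion(state, 4, 2, 1_000_000)
print(add_core, add_shard, new_state["liveShardMinIndex"],
      new_state["liveShardMaxIndex"])  # prints 4 2 3 4
```

In this example every node is already full at one Core each, so liveCoreNumPerNode rises to 2, four Cores are planned, and two new shards (indices 3 and 4) become the only shards that accept new documents.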
5. The SOLR cluster expansion method supporting resource balancing according to claim 4, characterized in that step S40 specifically comprises:
S401, creating a second thread and creating a mutex lock for the second thread, the second thread executing the following steps S402 to S404 to insert documents into the corresponding shards, the mutex lock being the same mutex lock as that created for the first thread;
S402, reading the parameters liveShardMinIndex and liveShardMaxIndex of the SOLR cluster;
S403, randomly selecting one shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J being the value of liveShardMaxIndex) as the target shard for storing the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
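Steps S402 and S403 amount to picking a shard name uniformly at random from the live range while holding the shared lock. The sketch below illustrates this under assumptions: the field name `shard_key`, the function names and the in-memory `state` dictionary are hypothetical, the lock is held only while the two indices are read, and the actual submission to SOLR (S404) is omitted.

```python
import random
import threading

# Lock shared with the shard-creating thread, so the index range is never
# read while it is being updated.
state_lock = threading.Lock()
state = {"liveShardMinIndex": 3, "liveShardMaxIndex": 4}  # hypothetical values


def choose_target_shard(rng=random) -> str:
    """S402-S403: read the live range and pick one shard name at random."""
    with state_lock:
        low = state["liveShardMinIndex"]
        high = state["liveShardMaxIndex"]
    return f"shard{rng.randint(low, high)}"  # randint is inclusive on both ends


def route_document(doc: dict, rng=random) -> dict:
    """Return a copy of the document with its shard key field set."""
    routed = dict(doc)
    routed["shard_key"] = choose_target_shard(rng)
    return routed


print(route_document({"id": "doc-1"})["shard_key"] in {"shard3", "shard4"})  # prints True
```

Because only the newest shards fall inside the live range, new documents spread evenly over them while older, fuller shards stop receiving writes.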
6. A SOLR cluster expansion system supporting resource balancing, characterized by comprising:
a node installation module, for installing SOLR nodes according to the hardware resources of the server;
a setup module, for setting the parameters of the SOLR cluster;
a shard creation module, for dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
a data insertion module, for inserting the new documents to be written into the corresponding shards according to the updated state values of the SOLR cluster, and updating the number of documents;
a loop module, for cyclically invoking the node installation module, the setup module, the shard creation module and the data insertion module.
7. The SOLR cluster expansion system supporting resource balancing according to claim 6, characterized in that the node installation module is further configured to: obtain, for each of the server's CPU, memory, disk space and network bandwidth, the maximum number of SOLR nodes that the resource can support; take the minimum of these values as the number of SOLR nodes the server can support; and install that number of SOLR nodes.
8. The SOLR cluster expansion system supporting resource balancing according to claim 6, characterized in that the parameters of the SOLR cluster in the setup module include: name, i.e. the name of the collection; configName, i.e. the configuration name of the collection; serverNodes, i.e. the nodes on which the collection is allowed to create Cores; and replicationFactor, i.e. the replication factor; and the state values of the SOLR cluster include: liveShardMinIndex, i.e. the minimum index in the list of shards that store incoming data; liveShardMaxIndex, i.e. the maximum index in the list of shards that store incoming data; liveCoreNumPerNode, i.e. the average number of Cores allocated to each node; nextShardedDocNum, i.e. the number of documents at which the next sharding takes place; and sumCoreNum, i.e. the total number of Cores on all nodes.
9. The SOLR cluster expansion system supporting resource balancing according to claim 8, characterized in that the shard creation module is further configured to perform the following steps:
S300, creating a first thread, the first thread monitoring the number of documents in the SOLR cluster and executing steps S301 and S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, a new shard is needed and the method proceeds to step S302; if it is smaller, the method proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added in this round according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, judging whether addCoreNum is less than the replication factor replicationFactor; if it is, proceeding to step S305; otherwise proceeding to step S307;
S305, increasing the value of liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added in this round according to the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard = addCoreNum / replicationFactor, taking the integer part;
S308, reading the SOLR cluster state values to obtain the number of SOLR Cores installed on each node, and comparing it with the value of liveCoreNumPerNode, the difference being the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores allowed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes, the shard being named shardX, where X is an incrementing, non-repeating integer, and for the i-th shard added in the current round the value of X is liveCoreNumPerNode + i;
S310, judging whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeping for a certain time and, after waking, returning to step S301 to check again.
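The placement rule of steps S308 and S309 (each shard's Cores go to different nodes, and no node exceeds liveCoreNumPerNode) can be sketched as below. This is an illustration only: the claim does not say how nodes are chosen among those with free capacity, so preferring the nodes with the most free slots is an assumption, as are the function name `place_shard` and the node names.

```python
def place_shard(cores_per_node: dict, live_core_num_per_node: int,
                replication_factor: int) -> list:
    """Choose distinct nodes for one new shard and record the new Cores.

    S308: free slots on a node = liveCoreNumPerNode - Cores already installed.
    S309: the shard's replicationFactor Cores are placed on different nodes.
    """
    free = {node: live_core_num_per_node - count
            for node, count in cores_per_node.items()}
    # Assumption: prefer nodes with the most free slots, ties broken by name.
    candidates = sorted((n for n in free if free[n] > 0),
                        key=lambda n: (-free[n], n))
    if len(candidates) < replication_factor:
        raise ValueError("not enough nodes with free Core capacity")
    chosen = candidates[:replication_factor]
    for node in chosen:
        cores_per_node[node] += 1
    return chosen


# Hypothetical cluster: 4 nodes with one Core each, target of 2 Cores per node.
cores = {"node1": 1, "node2": 1, "node3": 1, "node4": 1}
first = place_shard(cores, 2, 2)
second = place_shard(cores, 2, 2)
print(first, second)  # prints ['node1', 'node2'] ['node3', 'node4']
```

After both shards are placed, every node holds exactly two Cores, which is the balanced outcome the per-node limit is meant to produce.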
10. The SOLR cluster expansion system supporting resource balancing according to claim 9, characterized in that the data insertion module is further configured to perform the following steps:
S401, creating a second thread and creating a mutex lock for the second thread, the second thread executing the following steps S402 to S404 to insert documents into the corresponding shards, the mutex lock being the same mutex lock as that created for the first thread;
S402, reading the parameters liveShardMinIndex and liveShardMaxIndex of the SOLR cluster;
S403, randomly selecting one shard from shardI to shardJ in the shard list (I being the value of liveShardMinIndex and J being the value of liveShardMaxIndex) as the target shard for storing the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611234696.XA CN106648897B (en) | 2016-12-28 | 2016-12-28 | A kind of SOLR cluster expansion method and system for supporting balanced resource |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106648897A (en) | 2017-05-10 |
CN106648897B (en) | 2019-11-22 |
Family
ID=58832157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611234696.XA (Active, CN106648897B) | A kind of SOLR cluster expansion method and system for supporting balanced resource | 2016-12-28 | 2016-12-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106648897B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102591934A (en) * | 2011-12-23 | 2012-07-18 | 国网电力科学研究院 | Zookeeper-based method for realizing automatic expansion and switching of multiple Solr Shards |
CN103488702A (en) * | 2013-09-06 | 2014-01-01 | 云南电力试验研究院(集团)有限公司电力研究院 | SorlCloud based unstructured data retrieval method and system |
CN103701633A (en) * | 2013-12-09 | 2014-04-02 | 国家电网公司 | Setup and maintenance system of visual cluster application for distributed search SolrCloud |
CN104035836A (en) * | 2013-03-06 | 2014-09-10 | 阿里巴巴集团控股有限公司 | Automatic disaster tolerance recovery method and system in cluster retrieval platform |
Non-Patent Citations (5)
Title |
---|
JAYANT KUMAR: "《Apache Solr Search Patterns》", 30 April 2015, PACKT PUBLISHING LTD. * |
KHALED NAGI: "Bringing search engines to the cloud using open source components", 《2015 7TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (IC3K)》 * |
李戴维 等: "基于Solr的分布式全文检索系统的研究与实现", 《计算机与现代化》 * |
李聪颖 等: "大数据分布式全文检索系统的设计与实现", 《计算机与数字工程》 * |
赵璞 等: "高性能分布式搜索引擎Solr的研究与实现", 《电子科技》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107544848A (en) * | 2017-08-30 | 2018-01-05 | 深圳云天励飞技术有限公司 | Cluster expansion method, apparatus, electronic equipment and storage medium |
CN107566531A (en) * | 2017-10-17 | 2018-01-09 | 厦门市美亚柏科信息股份有限公司 | A kind of Elasticsearch cluster expansion methods for supporting balanced resource |
CN107566531B (en) * | 2017-10-17 | 2020-07-10 | 厦门市美亚柏科信息股份有限公司 | Elasticisearch cluster expansion method supporting balanced resources |
CN109241085A (en) * | 2018-09-20 | 2019-01-18 | 潘丽华 | A kind of big data SQL query method for SolrCloud |
CN109241085B (en) * | 2018-09-20 | 2022-06-21 | 郴州职业技术学院 | Big data SQL query method for SolrCloud |
CN111124696A (en) * | 2019-12-30 | 2020-05-08 | 北京三快在线科技有限公司 | Unit group creation method, unit group creation device, unit group data synchronization method, unit group data synchronization device, unit and storage medium |
CN111124696B (en) * | 2019-12-30 | 2023-06-23 | 北京三快在线科技有限公司 | Unit group creation, data synchronization method, device, unit and storage medium |
CN111914022A (en) * | 2020-07-23 | 2020-11-10 | 北京中数智汇科技股份有限公司 | Method and device for online capacity expansion of mongodb cluster |
CN111949494A (en) * | 2020-09-16 | 2020-11-17 | 北京浪潮数据技术有限公司 | Task regulation and control method, device and related equipment |
CN111949494B (en) * | 2020-09-16 | 2022-06-10 | 北京浪潮数据技术有限公司 | Task regulation and control method, device and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN106648897B (en) | 2019-11-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |