CN106648897A - SOLR cluster extension method and system supporting resource balancing - Google Patents


Info

Publication number
CN106648897A
Authority
CN
China
Prior art keywords
solr
shard
core
clusters
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611234696.XA
Other languages
Chinese (zh)
Other versions
CN106648897B (en)
Inventor
曾超
温若辉
赵庸
林艺滨
江汉祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meiya Pico Information Co Ltd
Priority to CN201611234696.XA
Publication of CN106648897A
Application granted
Publication of CN106648897B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1727 Details of free space management performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Abstract

The invention provides a SOLR cluster extension method and system supporting resource balancing. When a new server is added, new shards are created automatically according to the number of nodes on the servers and the growth of the data volume, so that new data is written to the corresponding servers in a balanced way. With the method and system, servers can be added to a SOLR cluster automatically and flexibly according to system performance and data volume. The system then creates shards and replica sets automatically according to server performance and balances the data volume across the corresponding SOLR Cores. No manual shard creation is needed, and local hot spots are avoided.

Description

A SOLR cluster extension method and system supporting resource balancing
Technical field
The present invention relates to the field of computer technology, and in particular to a SOLR cluster extension method and system supporting resource balancing.
Background
As society advances into the big data era, the storage and retrieval of massive data have been applied in every field. Full-text search is one of the common functions, used to achieve query results similar to those of Baidu or Taobao. SOLR is among the most widely used enterprise search servers for full-text search. It is feature-rich, offers near-real-time retrieval, supports clustering, and is a free, open-source Apache project.
SOLR's own clustering mechanism is fairly mature: the SOLRCloud component supports shards and replica sets (replication). When the data volume grows large enough to exhaust server resources, new servers are usually added to the cluster to share the load. SOLRCloud supports two sharding modes: compositeId and implicit. The compositeId mode computes a hash of the document ID to decide which shard the data falls into. The number of shards must be fixed when the collection is created, so shards cannot easily be added later, which makes this mode unsuitable for dynamic horizontal scaling of the cluster. The implicit mode allows a shard key to be specified when a shard is created, and the shard key value set when a document is inserted determines which shard stores it. Through the operation interface of the implicit mode, shards can therefore be added to a new server either manually or automatically by a simple system.
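The difference between the two sharding modes can be illustrated with a minimal Python sketch. It is not SOLR's implementation: the MD5-modulo hash is only a stand-in for SOLR's real hash-range routing, and the field name `shard_key` is a hypothetical choice made for this example.

```python
import hashlib


def composite_id_shard(doc_id: str, num_shards: int) -> int:
    """Hash-style routing (simplified stand-in, not SOLR's actual hash).

    The target shard is a pure function of the ID and the shard count,
    so changing num_shards remaps documents that are already stored.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def implicit_shard(doc: dict, shard_key_field: str = "shard_key") -> str:
    """Explicit routing: the document itself names its target shard.

    shard_key_field is a hypothetical field name used for illustration.
    """
    return doc[shard_key_field]


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(100)]
    moved = sum(composite_id_shard(d, 4) != composite_id_shard(d, 5) for d in ids)
    print(f"{moved} of {len(ids)} documents change shard when going from 4 to 5 shards")
    print(implicit_shard({"id": "doc-1", "shard_key": "shard3"}))
```

With explicit routing, adding a shard never moves existing documents, which is why the method described here builds on the implicit mode.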
In the prior art, the implicit sharding mode of SOLRCloud provides an interface for adding shards dynamically, so one class of methods simply shards automatically by time period, for example monthly, creating one shard each month. On the first day of the month, a shard is automatically added on the few servers holding the least data, to store the data of the coming month. Although this achieves automatic sharding, it creates local hot spots. Moreover, the resources of the servers in a cluster may differ. For example, in a cluster that mixes old and new servers, disk speed, disk space, and memory size may all differ between servers, so shards cannot simply be allocated evenly.
Summary of the invention
To this end, the present invention proposes a SOLR cluster extension method and system supporting resource balancing. When a new server is added, new shards are created automatically according to the number of nodes on the servers and the growth of the data volume, so that new data is written to the corresponding servers in a balanced way. The invention first installs an appropriate number of SOLR nodes according to each server's hardware resources. A thread then monitors the number of documents in the cluster and creates shards dynamically as required. The ingest thread inserts newly ingested data evenly into the new shards according to the sharding parameters adjusted by the monitoring thread.
The specific scheme is as follows:
A SOLR cluster extension method supporting resource balancing, comprising the steps of:
S10, installing SOLR nodes according to the hardware resources of the server;
S20, setting the parameters of the SOLR cluster;
S30, dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster, and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
S40, inserting the new documents to be written into the corresponding shards according to the state values of the SOLR cluster updated in step S30, and updating the number of documents;
Steps S10 to S40 are executed in a loop.
Further, step S10 specifically comprises: obtaining, separately for the CPU, memory, disk space, and network bandwidth of the server, the maximum number of SOLR nodes that each resource can support; taking the minimum of these values as the number of SOLR nodes the server can support; and installing that number of SOLR nodes.
Further, the parameters of the SOLR cluster include: name, the name of the collection; configName, the configuration name of the collection; serverNodes, the nodes on which the collection is allowed to create Cores; and replicationFactor, the replication factor. The dynamic (state) values of the SOLR cluster include: liveShardMinIndex, the minimum index in the list of shards that receive ingested data; liveShardMaxIndex, the maximum index in that list; liveCoreNumPerNode, the average number of Cores allocated to each node; nextShardedDocNum, the document count at which the next sharding takes place; and sumCoreNum, the number of Cores on all nodes.
Further, step S30 specifically comprises:
S300, creating a first thread, which monitors the number of documents in the SOLR cluster and performs step S301 and steps S303 to S311 to create shards dynamically;
S301, reading the total number of documents of the collection in the SOLR cluster and comparing it with nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, a new shard is needed and the flow proceeds to step S302; if it is less, the flow proceeds to step S313;
S302, creating a mutex lock for the first thread;
S303, calculating the number of Cores planned to be added in this round by the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S304, determining whether addCoreNum is less than the replication factor replicationFactor; if it is less, proceeding to step S305, otherwise proceeding to step S307;
S305, incrementing liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added in this round by the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
S307, calculating the number of shards planned to be added by the formula addShard = addCoreNum / replicationFactor, taking the integer part;
S308, reading the SOLR cluster state values to obtain the number of SOLR Cores installed on each node and comparing it with liveCoreNumPerNode; the difference is the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores allowed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed on different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer; for the i-th shard added in the current round, X is liveCoreNumPerNode+i;
S310, determining whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
S312, releasing the mutex lock;
S313, the first thread sleeps for a set period and, after waking, returns to step S301 to check again.
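The arithmetic of steps S303 to S307 and the state update of step S311 can be traced with a small Python sketch. It makes two assumptions that the text does not spell out: `Nodes` is read as the number of SOLR nodes in the cluster, and "taking the integer part" is implemented as floor division. The `State` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class State:
    liveShardMinIndex: int
    liveShardMaxIndex: int
    liveCoreNumPerNode: int
    nextShardedDocNum: int
    sumCoreNum: int


def plan_round(state: State, node_count: int, replication_factor: int):
    """Return (addCoreNum, addShard) for one round, following S303 to S307."""
    add_core = state.liveCoreNumPerNode * node_count - state.sumCoreNum      # S303
    if add_core < replication_factor:                                        # S304
        state.liveCoreNumPerNode += 1                                        # S305
        add_core = state.liveCoreNumPerNode * node_count - state.sumCoreNum  # S306
    add_shard = add_core // replication_factor                               # S307
    return add_core, add_shard


def commit_round(state: State, add_core: int, add_shard: int, doc_num_per_core: int) -> None:
    """Apply the S311 updates in the order the text lists them."""
    state.liveShardMinIndex = state.liveShardMaxIndex + 1
    state.liveShardMaxIndex = state.liveShardMaxIndex + add_shard
    state.sumCoreNum = state.sumCoreNum + add_core
    state.nextShardedDocNum = state.sumCoreNum * doc_num_per_core


if __name__ == "__main__":
    s = State(0, 0, 0, 0, 0)  # empty cluster: no shards, no Cores yet
    cores, shards = plan_round(s, node_count=4, replication_factor=2)
    commit_round(s, cores, shards, doc_num_per_core=1_000_000)
    print(cores, shards, s)
```

Starting from an empty cluster with 4 nodes and a replication factor of 2, the first round raises liveCoreNumPerNode to 1, plans 4 Cores and 2 shards, and sets the live shard range to 1 through 2.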
Further, step S40 specifically comprises:
S401, creating a second thread and creating a mutex lock for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shards; the mutex lock is the same mutex lock as the one created for the first thread;
S402, reading the parameters liveShardMinIndex and liveShardMaxIndex of the SOLR cluster;
S403, randomly selecting a shard from shardI to shardJ in the shard list (I is the value of liveShardMinIndex and J is the value of liveShardMaxIndex) as the ingest target for the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that the new document data is inserted into the corresponding shards in a balanced way.
A SOLR cluster extension system supporting resource balancing, comprising:
a node installation module, for installing SOLR nodes according to the hardware resources of the server;
a setup module, for setting the parameters of the SOLR cluster;
a shard creation module, for dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster, and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
a data insertion module, for inserting the new documents to be written into the corresponding shards according to the updated state values of the SOLR cluster, and updating the number of documents;
a loop module, for cycling through the node installation module, the setup module, the shard creation module, and the data insertion module.
Further, the node installation module is specifically configured to: obtain, separately for the CPU, memory, disk space, and network bandwidth of the server, the maximum number of SOLR nodes that each resource can support; take the minimum of these values as the number of SOLR nodes the server can support; and install that number of SOLR nodes.
Further, the parameters of the SOLR cluster in the setup module include: name, the name of the collection; configName, the configuration name of the collection; serverNodes, the nodes on which the collection is allowed to create Cores; and replicationFactor, the replication factor. The dynamic (state) values of the SOLR cluster include: liveShardMinIndex, the minimum index in the list of shards that receive ingested data; liveShardMaxIndex, the maximum index in that list; liveCoreNumPerNode, the average number of Cores allocated to each node; nextShardedDocNum, the document count at which the next sharding takes place; and sumCoreNum, the number of Cores on all nodes.
Further, the shard creation module is specifically configured to perform steps S300 to S313 as set out above for step S30, and the data insertion module is specifically configured to perform steps S401 to S404 as set out above for step S40.
Beneficial effects of the invention: 1) Each server in the SOLR cluster stores a share of the data in proportion to its performance, which balances the load, and shards are created automatically against a target data volume to extend the cluster. This prevents any single shard from growing too large and avoids local hot spots when new data is inserted. In addition, after a new server joins the cluster, it is recognized automatically and the SOLR Core count and document count are load-balanced;
2) A dynamic model is proposed: a thread monitors the growth of the document volume in SOLR and, under certain conditions, dynamically creates new Cores according to the number of nodes on the servers. Subsequently inserted data is distributed to the new Cores, which achieves dynamic sharding and load balancing of the SOLR cluster.
Brief description of the drawings
Fig. 1 is a flow chart of dynamic shard creation in one embodiment of the invention;
Fig. 2 is a flow chart of inserting documents into the corresponding shards in one embodiment of the invention.
Detailed description
To further illustrate the embodiments, the present invention is provided with drawings. These drawings are part of the disclosure. They mainly illustrate the embodiments and, together with the related description, explain how the embodiments operate. With reference to them, a person of ordinary skill in the art will understand other possible embodiments and the advantages of the invention. The invention is now further described with reference to the drawings and specific embodiments.
The SOLR cluster extension method supporting resource balancing of one embodiment of the invention specifically comprises the following steps:
1. Install SOLR nodes according to server hardware resources:
In a SOLR cluster, data is physically stored on nodes. To make full use of the resources of each server, the maximum number of nodes it can host must be evaluated first. This number depends mainly on the performance of hardware such as CPU, memory, disk space, and network bandwidth. For each of these, an estimation function is first given empirically as the calculation formula. Taking memory as an example: first obtain the total memory Sall of the server, then subtract the memory Sother needed by the operating system and other non-SOLR applications. This gives the maximum memory that can be allocated to SOLR nodes, SSOLR = Sall - Sother. Dividing by the memory Snode needed by each node gives the maximum number of SOLR nodes that the memory allows. The estimation function fmem for the number of nodes supported by memory is calculated as follows:
fmem = (Sall - Sother) / Snode
Similarly, estimation functions fcpu, fdisk, and fnet for the number of SOLR nodes supported by the CPU, disk, and network are given according to the specific system and business application. The number of SOLR nodes the server can finally support is the minimum of these: fserver = min(fdisk, fcpu, fmem, fnet). SOLR nodes are installed according to this estimate and added to SOLRCloud. By default, no shard is created at first.
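The estimate can be sketched in a few lines of Python. Only the memory formula is given in the text; applying the same "(total - reserved) / per node" shape to CPU, disk, and network is an assumption made for illustration, as are the class name, the units, and all sample figures.

```python
import math
from dataclasses import dataclass


@dataclass
class ResourceEstimate:
    total: float     # total capacity of the resource on the server (Sall for memory)
    reserved: float  # share needed by the OS and non-SOLR applications (Sother)
    per_node: float  # amount one SOLR node is assumed to need (Snode)

    def max_nodes(self) -> int:
        """Number of whole SOLR nodes this resource alone can carry."""
        usable = max(self.total - self.reserved, 0.0)
        return math.floor(usable / self.per_node)


def supported_nodes(estimates: dict[str, ResourceEstimate]) -> int:
    """fserver = min(fdisk, fcpu, fmem, fnet): the scarcest resource decides."""
    return min(e.max_nodes() for e in estimates.values())


if __name__ == "__main__":
    server = {
        "mem": ResourceEstimate(total=64, reserved=16, per_node=8),       # GB, hypothetical
        "cpu": ResourceEstimate(total=16, reserved=2, per_node=4),        # cores, hypothetical
        "disk": ResourceEstimate(total=4000, reserved=500, per_node=500), # GB, hypothetical
        "net": ResourceEstimate(total=1000, reserved=100, per_node=150),  # Mbit/s, hypothetical
    }
    print(supported_nodes(server))
```

In this sample the CPU is the bottleneck: memory would allow 6 nodes, but the CPU allows only 3, so 3 nodes would be installed.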
2. Set the basic parameters of the dynamic cluster:
In the SOLR dynamic cluster, some basic parameters must be set first: which collection (name) creates shards, on which nodes (serverNodes), with which configuration (configName), and on how many Cores each shard is created (replicationFactor). Next, some state values of the dynamic cluster are recorded, such as the index range of the currently active shards (liveShardMinIndex to liveShardMaxIndex), the current average number of Cores allocated to each node, the document count at which the next shard is added, and the current number of Cores on all relevant nodes. These parameters and state values are persisted to disk in a database, XML, or a similar form.
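One possible way to hold and persist these values is sketched below in Python. The text mentions a database or XML; JSON is used here purely as a convenient stand-in. The class names, the function names, the zero initial values, and the write-then-rename step are assumptions of this sketch.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ClusterParams:
    name: str                      # collection name
    configName: str                # configuration name of the collection
    serverNodes: list = field(default_factory=list)  # nodes allowed to host Cores
    replicationFactor: int = 1


@dataclass
class ClusterState:
    liveShardMinIndex: int = 0     # assumed initial values: no shard exists yet
    liveShardMaxIndex: int = 0
    liveCoreNumPerNode: int = 0
    nextShardedDocNum: int = 0
    sumCoreNum: int = 0


def save_cluster(path: Path, params: ClusterParams, state: ClusterState) -> None:
    """Write to a temporary file first, then rename, so readers never see a partial file."""
    payload = {"params": asdict(params), "state": asdict(state)}
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_cluster(path: Path):
    """Read parameters and state values back from disk."""
    raw = json.loads(path.read_text(encoding="utf-8"))
    return ClusterParams(**raw["params"]), ClusterState(**raw["state"])
```

Keeping the static parameters and the mutable state values in separate records mirrors the distinction the text draws between the two groups.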
3. Create shards dynamically according to the parameters and adjust the ingest strategy to support the dynamic cluster:
A thread is first created to check the document count periodically and to create shards dynamically according to the document count and the system parameters. As shown in Fig. 1, the main processing flow is as follows:
Step 1: Read the total number of documents of the collection from SOLRCloud and compare it with nextShardedDocNum. If it is greater than or equal to nextShardedDocNum, a new shard is needed and the flow proceeds to step 2. If it is less, the flow proceeds to step 13;
Step 2: To prevent the ingest thread from reading incomplete data, acquire the mutex lock;
Step 3: Calculate the number of Cores planned to be added in this round by the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
Step 4: Determine whether addCoreNum is less than the replication factor replicationFactor. If it is less, proceed to step 5; otherwise proceed to step 7;
Step 5: Increment liveCoreNumPerNode by 1;
Step 6: Recalculate the number of Cores planned to be added in this round by the formula addCoreNum = liveCoreNumPerNode * Nodes - sumCoreNum;
Step 7: Calculate the number of shards planned to be added by the formula addShard = addCoreNum / replicationFactor, taking the integer part;
Step 8: Read the cluster state from ZooKeeper to obtain the number of SOLR Cores installed on each node, and compare it with liveCoreNumPerNode. The difference is the maximum number of Cores that may still be installed on that node;
Step 9: According to the number of Cores allowed on each node, create one shard on replicationFactor Cores. To provide disaster tolerance, these replicationFactor Cores are distributed on different nodes as far as possible. The shard is named shardX, where X is an incrementing, non-repeating integer; for the i-th shard added in the current round, X is liveCoreNumPerNode+i;
Step 10: Determine whether addShard shards have been created. If not, return to step 9 to continue creating shards; if so, proceed to step 11;
Step 11: Update the dynamic sharding state values and persist the update to disk:
liveShardMinIndex = liveShardMaxIndex + 1;
liveShardMaxIndex = liveShardMaxIndex + addShard;
sumCoreNum = sumCoreNum + addCoreNum;
nextShardedDocNum = sumCoreNum * docNumPerCore;
Step 12: Release the mutex lock so that ingestion can continue;
Step 13: The thread sleeps for a set period and, after waking, returns to step 1 to check again;
Note: when a new server joins the cluster, SOLR nodes are first installed manually and configured into the cluster, and the serverNodes parameter is updated at the same time. Cores are not created on the new server immediately after it joins. For a more balanced data distribution, creation is triggered only when the next round of shard creation is needed. Because the nodes on the new server have no Cores yet, most of the Cores created in that round are placed on the new server, so that Cores end up evenly distributed over the nodes. If a new server has joined the cluster, the check in step 4 jumps directly to step 7; otherwise the flow proceeds in order from step 4 to step 7.
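Steps 8 and 9 can be illustrated with a small placement sketch in Python. The text only requires that the Cores of a shard sit on different nodes "as far as possible"; the greedy "most free slots first" rule, the fallback that reuses a node when too few distinct nodes have room, and all names below are assumptions. The starting shard number is passed in as `first_index`, so the sketch takes no position on how X is derived.

```python
def place_shards(cores_per_node: dict, live_core_num_per_node: int,
                 add_shard: int, replication_factor: int, first_index: int) -> dict:
    """Return {shard_name: [node, ...]} for the shards added in one round."""
    # Step 8: free slots per node = liveCoreNumPerNode - Cores already installed.
    free = {n: max(live_core_num_per_node - c, 0) for n, c in cores_per_node.items()}
    placement = {}
    for i in range(add_shard):
        chosen = []
        # Step 9, first pass: distinct nodes, emptiest node first.
        for node in sorted(free, key=lambda n: (-free[n], n)):
            if len(chosen) == replication_factor:
                break
            if free[node] > 0:
                chosen.append(node)
                free[node] -= 1
        # Fallback (assumed): reuse a node if too few distinct nodes have room.
        while len(chosen) < replication_factor:
            node = max(free, key=lambda n: (free[n], n))
            if free[node] <= 0:
                raise ValueError("no free Core slots left for this round")
            chosen.append(node)
            free[node] -= 1
        placement[f"shard{first_index + i}"] = chosen
    return placement


if __name__ == "__main__":
    # node "c" was just added and is still empty, so it receives the most Cores
    print(place_shards({"a": 1, "b": 1, "c": 0}, live_core_num_per_node=2,
                       add_shard=2, replication_factor=2, first_index=3))
```

The empty node receives a Core from both new shards, which matches the note above: most Cores of the round after a server joins land on that server.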
4. The data parsing and ingest thread inserts new data evenly into the corresponding Cores according to the dynamic sharding state values. As shown in Fig. 2, the main processing flow is as follows:
Step 1: To avoid reading incomplete data, first acquire the mutex lock, which is the same lock used by the dynamic sharding thread;
Step 2: Read the dynamic sharding parameters liveShardMinIndex and liveShardMaxIndex;
Step 3: Randomly select a shard from shardI to shardJ in the shard list (I is the value of liveShardMinIndex and J is the value of liveShardMaxIndex) as the ingest target for the new data, and set the shard key field to the name of the selected shard;
Step 4: Submit the document. New data is thus inserted into the corresponding shards in a balanced way, and local hot spots are eliminated.
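The ingest-side selection can be sketched in Python as follows. The shared dictionary stands in for the persisted state values, the field name `shard_key` is hypothetical, and the actual submission to SOLR is left out. Holding the lock only while the two indices are read is a choice of this sketch; the text does not say when the ingest thread releases it.

```python
import random
import threading

# Shared with the shard-creation thread, which would update it under the same lock.
state_lock = threading.Lock()
state = {"liveShardMinIndex": 1, "liveShardMaxIndex": 2}


def pick_target_shard(rng=random) -> str:
    """Steps 1 to 3: read the live range under the lock, then pick uniformly."""
    with state_lock:
        low = state["liveShardMinIndex"]
        high = state["liveShardMaxIndex"]
    return f"shard{rng.randint(low, high)}"  # randint includes both ends


def prepare_document(doc: dict, shard_key_field: str = "shard_key", rng=random) -> dict:
    """Return a copy of doc with the shard key set to the chosen shard name."""
    routed = dict(doc)
    routed[shard_key_field] = pick_target_shard(rng)
    return routed


if __name__ == "__main__":
    print(prepare_document({"id": "doc-1", "title": "hello"}))
```

Because only the most recently created shards are in the live range, new documents spread over the newest Cores instead of piling onto a single one.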
In another embodiment, the present invention provides a SOLR cluster extension system supporting resource balancing. It comprises the node installation module, setup module, shard creation module, data insertion module, and loop module set out in the summary above, which carry out steps S10 to S40 of the method, including steps S300 to S313 and steps S401 to S404.
With the above method and system, servers can be added to a SOLR cluster automatically and flexibly according to system performance and data volume. The system then creates shards and replica sets automatically according to server performance and balances the data volume across the corresponding SOLR Cores. No manual shard creation is needed, and local hot spots are avoided.
Although the present invention has been specifically shown and described with reference to preferred embodiments, those skilled in the art should understand that various changes in form and detail may be made to the invention without departing from the spirit and scope of the invention as defined by the appended claims, and that such changes fall within the scope of protection of the invention.

Claims (10)

1. A SOLR cluster expansion method supporting resource balancing, characterized by comprising the steps of:
S10, installing SOLR nodes according to the hardware resources of the server;
S20, setting the parameters of the SOLR cluster;
S30, dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster, and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
S40, inserting the new documents to be written into the corresponding shards according to the state values of the SOLR cluster updated in step S30, and updating the number of documents;
repeating steps S10 to S40 in a loop.
2. The SOLR cluster expansion method supporting resource balancing according to claim 1, characterized in that step S10 specifically comprises: obtaining, for each of the server's CPU, memory, disk space and network bandwidth, the maximum number of SOLR nodes that the resource can support; taking the minimum of these values as the number of SOLR nodes the server can support; and installing that number of SOLR nodes.
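The node-count rule of step S10 can be sketched as follows. The text does not say how the per-resource maximum is derived, so this sketch assumes a fixed per-node requirement for each resource; the function name, the dictionary keys and all figures are hypothetical.

```python
def max_solr_nodes(server, per_node):
    """Return how many SOLR nodes a server can host: the minimum over all resources."""
    # For each resource, integer division gives the most nodes that resource allows.
    return min(server[resource] // per_node[resource] for resource in per_node)


# Hypothetical server capacity and per-node requirements.
server = {"cpu_cores": 32, "memory_gb": 128, "disk_gb": 4000, "bandwidth_mbps": 10000}
per_node = {"cpu_cores": 4, "memory_gb": 16, "disk_gb": 1000, "bandwidth_mbps": 1000}

if __name__ == "__main__":
    # CPU allows 8, memory 8, disk 4, bandwidth 10, so disk is the limiting resource.
    print(max_solr_nodes(server, per_node))  # prints 4
```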
3. The SOLR cluster expansion method supporting resource balancing according to claim 1, characterized in that the parameters of the SOLR cluster include: name, i.e. the name of the collection; configName, i.e. the configuration name of the collection; serverNodes, i.e. the nodes on which the collection is allowed to create Cores; and replicationFactor, i.e. the replication factor; and the state values of the SOLR cluster include: liveShardMinIndex, i.e. the minimum index in the list of shards that store ingested data; liveShardMaxIndex, i.e. the maximum index in the list of shards that store ingested data; liveCoreNumPerNode, i.e. the average number of Cores allocated to each node; nextShardedDocNum, i.e. the number of documents at which the next sharding takes place; and sumCoreNum, i.e. the number of Cores across all nodes.
4. The SOLR cluster expansion method supporting resource balancing according to claim 3, characterized in that step S30 specifically comprises:
S300, creating a first thread, the first thread monitoring the number of documents in the SOLR cluster and performing steps S301 and S303 to S311 to dynamically create shards;
S301, reading the total number of documents in the collection of the SOLR cluster and determining whether it has reached nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, a new shard is needed and the method proceeds to step S302; if it is less, the method proceeds to step S313;
S302, creating a mutex for the first thread;
S303, calculating the number of Cores planned to be added in this round according to the formula addCoreNum=liveCoreNumPerNode*Nodes-sumCoreNum;
S304, determining whether addCoreNum is less than the replication factor replicationFactor; if it is less, proceeding to step S305, otherwise proceeding to step S307;
S305, increasing the value of liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added in this round according to the formula addCoreNum=liveCoreNumPerNode*Nodes-sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard=addCoreNum/replicationFactor, rounded to an integer;
S308, reading the SOLR cluster state values, obtaining the number of SOLR Cores installed on each node, and comparing it with the value of liveCoreNumPerNode, the difference being the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores that may be installed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed across different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer, and for the i-th shard added in the current round the value of X is liveCoreNumPerNode+i;
S310, determining whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex=liveShardMaxIndex+1;
liveShardMaxIndex=liveShardMaxIndex+addShard;
sumCoreNum=sumCoreNum+addCoreNum;
nextShardedDocNum=sumCoreNum*docNumPerCore;
S312, releasing the mutex;
S313, putting the first thread to sleep for a certain time and, after it wakes, returning to step S301 to check again.
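The arithmetic of steps S303 to S307 and the state update of S311 can be traced with a short Python sketch. It covers only the bookkeeping: shard creation (S308 to S310), locking and persistence to disk are omitted, and the `ClusterState` class and `plan_and_update` function are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    # Field names mirror the state values listed in claim 3.
    liveShardMinIndex: int
    liveShardMaxIndex: int
    liveCoreNumPerNode: int
    nextShardedDocNum: int
    sumCoreNum: int


def plan_and_update(state, nodes, replication_factor, doc_num_per_core):
    """Apply S303-S307 and S311 to state in place; return the number of shards to add."""
    # S303: Cores planned for this round.
    add_core_num = state.liveCoreNumPerNode * nodes - state.sumCoreNum
    # S304-S306: too few Cores for even one shard, so raise the per-node quota once.
    if add_core_num < replication_factor:
        state.liveCoreNumPerNode += 1
        add_core_num = state.liveCoreNumPerNode * nodes - state.sumCoreNum
    # S307: shards planned, rounded down to an integer.
    add_shard = add_core_num // replication_factor
    # S311: both index updates read the old liveShardMaxIndex, in the order given.
    state.liveShardMinIndex = state.liveShardMaxIndex + 1
    state.liveShardMaxIndex = state.liveShardMaxIndex + add_shard
    state.sumCoreNum = state.sumCoreNum + add_core_num
    state.nextShardedDocNum = state.sumCoreNum * doc_num_per_core
    return add_shard


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes, replication factor 2, one Core per node so far.
    s = ClusterState(1, 2, 1, 4_000_000, 4)
    print(plan_and_update(s, nodes=4, replication_factor=2, doc_num_per_core=1_000_000))
    print(s)
```

In this example the first calculation yields 0 planned Cores, so liveCoreNumPerNode rises from 1 to 2, giving 4 new Cores and 2 new shards; the live range moves from shard1..shard2 to shard3..shard4.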
5. The SOLR cluster expansion method supporting resource balancing according to claim 4, characterized in that step S40 specifically comprises:
S401, creating a second thread and creating a mutex for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shard, the mutex being the same mutex as the one created for the first thread;
S402, reading the SOLR cluster state values liveShardMinIndex and liveShardMaxIndex;
S403, randomly selecting a shard from shardI to shardJ in the shard list (where I is the value of liveShardMinIndex and J is the value of liveShardMaxIndex) as the target shard for ingesting the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
6. A SOLR cluster expansion system supporting resource balancing, characterized by comprising:
a node installation module, for installing SOLR nodes according to the hardware resources of the server;
a setting module, for setting the parameters of the SOLR cluster;
a shard creation module, for dynamically creating shards according to the parameters of the SOLR cluster, the current number of documents in the SOLR cluster, and the current state values of the SOLR cluster, and updating the state values of the SOLR cluster;
a data insertion module, for inserting the new documents to be written into the corresponding shards according to the updated state values of the SOLR cluster, and updating the number of documents;
a loop module, for cyclically invoking the node installation module, the setting module, the shard creation module and the data insertion module.
7. The SOLR cluster expansion system supporting resource balancing according to claim 6, characterized in that the node installation module is further specifically configured to: obtain, for each of the server's CPU, memory, disk space and network bandwidth, the maximum number of SOLR nodes that the resource can support; take the minimum of these values as the number of SOLR nodes the server can support; and install that number of SOLR nodes.
8. The SOLR cluster expansion system supporting resource balancing according to claim 6, characterized in that the parameters of the SOLR cluster in the setting module include: name, i.e. the name of the collection; configName, i.e. the configuration name of the collection; serverNodes, i.e. the nodes on which the collection is allowed to create Cores; and replicationFactor, i.e. the replication factor; and the state values of the SOLR cluster include: liveShardMinIndex, i.e. the minimum index in the list of shards that store ingested data; liveShardMaxIndex, i.e. the maximum index in the list of shards that store ingested data; liveCoreNumPerNode, i.e. the average number of Cores allocated to each node; nextShardedDocNum, i.e. the number of documents at which the next sharding takes place; and sumCoreNum, i.e. the number of Cores across all nodes.
9. The SOLR cluster expansion system supporting resource balancing according to claim 8, characterized in that the shard creation module is further specifically configured to perform the following steps:
S300, creating a first thread, the first thread monitoring the number of documents in the SOLR cluster and performing steps S301 and S303 to S311 to dynamically create shards;
S301, reading the total number of documents in the collection of the SOLR cluster and determining whether it has reached nextShardedDocNum; if it is greater than or equal to nextShardedDocNum, a new shard is needed and the system proceeds to step S302; if it is less, the system proceeds to step S313;
S302, creating a mutex for the first thread;
S303, calculating the number of Cores planned to be added in this round according to the formula addCoreNum=liveCoreNumPerNode*Nodes-sumCoreNum;
S304, determining whether addCoreNum is less than the replication factor replicationFactor; if it is less, proceeding to step S305, otherwise proceeding to step S307;
S305, increasing the value of liveCoreNumPerNode by 1;
S306, recalculating the number of Cores planned to be added in this round according to the formula addCoreNum=liveCoreNumPerNode*Nodes-sumCoreNum;
S307, calculating the number of shards planned to be added according to the formula addShard=addCoreNum/replicationFactor, rounded to an integer;
S308, reading the SOLR cluster state values, obtaining the number of SOLR Cores installed on each node, and comparing it with the value of liveCoreNumPerNode, the difference being the maximum number of Cores that may still be installed on that node;
S309, according to the number of Cores that may be installed on each node, creating one shard on replicationFactor Cores, the replicationFactor Cores being distributed across different nodes; the shard is named shardX, where X is an incrementing, non-repeating integer, and for the i-th shard added in the current round the value of X is liveCoreNumPerNode+i;
S310, determining whether addShard shards have been created; if not, returning to step S309 to continue creating shards; if so, proceeding to step S311;
S311, updating the SOLR cluster state values and persisting the update to disk, the state values being updated as follows:
liveShardMinIndex=liveShardMaxIndex+1;
liveShardMaxIndex=liveShardMaxIndex+addShard;
sumCoreNum=sumCoreNum+addCoreNum;
nextShardedDocNum=sumCoreNum*docNumPerCore;
S312, releasing the mutex;
S313, putting the first thread to sleep for a certain time and, after it wakes, returning to step S301 to check again.
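The placement rule of steps S308 and S309 (replicas of one shard go to different nodes that still have spare Core capacity) can be sketched in Python as below. The text does not state how to choose among eligible nodes, so preferring the nodes with the most spare capacity is an assumption of this sketch, as are the function name and the node names.

```python
def place_shard(cores_per_node, live_core_num_per_node, replication_factor):
    """Choose replication_factor distinct nodes for one shard and update the counts in place."""
    # S308: spare capacity is liveCoreNumPerNode minus the Cores already installed.
    spare = {node: live_core_num_per_node - count for node, count in cores_per_node.items()}
    # Assumed tie-break: most spare capacity first, then node name for determinism.
    candidates = sorted(
        (node for node in spare if spare[node] > 0),
        key=lambda node: (-spare[node], node),
    )
    if len(candidates) < replication_factor:
        raise RuntimeError("not enough nodes with spare Core capacity")
    # S309: one Core per chosen node, so the replicas land on different nodes.
    chosen = candidates[:replication_factor]
    for node in chosen:
        cores_per_node[node] += 1
    return chosen


if __name__ == "__main__":
    # Hypothetical Core counts per node, with liveCoreNumPerNode = 2.
    cores = {"node1": 1, "node2": 1, "node3": 2, "node4": 1}
    print(place_shard(cores, live_core_num_per_node=2, replication_factor=2))
    print(cores)
```

With these figures node3 is already full, so the shard's two replicas go to node1 and node2, leaving only node4 with spare capacity; a further call with the same quota raises an error, which corresponds to the case where liveCoreNumPerNode has to be increased first.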
10. The SOLR cluster expansion system supporting resource balancing according to claim 9, characterized in that the data insertion module is further specifically configured to perform the following steps:
S401, creating a second thread and creating a mutex for the second thread, the second thread performing steps S402 to S404 below to insert documents into the corresponding shard, the mutex being the same mutex as the one created for the first thread;
S402, reading the SOLR cluster state values liveShardMinIndex and liveShardMaxIndex;
S403, randomly selecting a shard from shardI to shardJ in the shard list (where I is the value of liveShardMinIndex and J is the value of liveShardMaxIndex) as the target shard for ingesting the new document data, and setting the shard key field to the name of the selected shard;
S404, submitting the document, so that new document data is inserted into the corresponding shards in a balanced manner.
CN201611234696.XA 2016-12-28 2016-12-28 SOLR cluster expansion method and system supporting resource balancing Active CN106648897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611234696.XA CN106648897B (en) 2016-12-28 2016-12-28 SOLR cluster expansion method and system supporting resource balancing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611234696.XA CN106648897B (en) 2016-12-28 2016-12-28 SOLR cluster expansion method and system supporting resource balancing

Publications (2)

Publication Number Publication Date
CN106648897A true CN106648897A (en) 2017-05-10
CN106648897B CN106648897B (en) 2019-11-22

Family

ID=58832157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611234696.XA Active CN106648897B (en) 2016-12-28 2016-12-28 SOLR cluster expansion method and system supporting resource balancing

Country Status (1)

Country Link
CN (1) CN106648897B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107544848A (en) * 2017-08-30 2018-01-05 深圳云天励飞技术有限公司 Cluster expansion method, apparatus, electronic equipment and storage medium
CN107566531A (en) * 2017-10-17 2018-01-09 厦门市美亚柏科信息股份有限公司 A kind of Elasticsearch cluster expansion methods for supporting balanced resource
CN109241085A (en) * 2018-09-20 2019-01-18 潘丽华 A kind of big data SQL query method for SolrCloud
CN111124696A (en) * 2019-12-30 2020-05-08 北京三快在线科技有限公司 Unit group creation method, unit group creation device, unit group data synchronization method, unit group data synchronization device, unit and storage medium
CN111914022A (en) * 2020-07-23 2020-11-10 北京中数智汇科技股份有限公司 Method and device for online capacity expansion of mongodb cluster
CN111949494A (en) * 2020-09-16 2020-11-17 北京浪潮数据技术有限公司 Task regulation and control method, device and related equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591934A (en) * 2011-12-23 2012-07-18 国网电力科学研究院 Zookeeper-based method for realizing automatic expansion and switching of multiple Solr Shards
CN103488702A (en) * 2013-09-06 2014-01-01 云南电力试验研究院(集团)有限公司电力研究院 SorlCloud based unstructured data retrieval method and system
CN103701633A (en) * 2013-12-09 2014-04-02 国家电网公司 Setup and maintenance system of visual cluster application for distributed search SolrCloud
CN104035836A (en) * 2013-03-06 2014-09-10 阿里巴巴集团控股有限公司 Automatic disaster tolerance recovery method and system in cluster retrieval platform

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591934A (en) * 2011-12-23 2012-07-18 国网电力科学研究院 Zookeeper-based method for realizing automatic expansion and switching of multiple Solr Shards
CN104035836A (en) * 2013-03-06 2014-09-10 阿里巴巴集团控股有限公司 Automatic disaster tolerance recovery method and system in cluster retrieval platform
CN103488702A (en) * 2013-09-06 2014-01-01 云南电力试验研究院(集团)有限公司电力研究院 SorlCloud based unstructured data retrieval method and system
CN103701633A (en) * 2013-12-09 2014-04-02 国家电网公司 Setup and maintenance system of visual cluster application for distributed search SolrCloud

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JAYANT KUMAR: "《Apache Solr Search Patterns》", 30 April 2015, PACKT PUBLISHING LTD. *
KHALED NAGI: "Bringing search engines to the cloud using open source components", 《2015 7TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (IC3K)》 *
李戴维 等: "基于Solr的分布式全文检索系统的研究与实现", 《计算机与现代化》 *
李聪颖 等: "大数据分布式全文检索系统的设计与实现", 《计算机与数字工程》 *
赵璞 等: "高性能分布式搜索引擎Solr的研究与实现", 《电子科技》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107544848A (en) * 2017-08-30 2018-01-05 深圳云天励飞技术有限公司 Cluster expansion method, apparatus, electronic equipment and storage medium
CN107566531A (en) * 2017-10-17 2018-01-09 厦门市美亚柏科信息股份有限公司 A kind of Elasticsearch cluster expansion methods for supporting balanced resource
CN107566531B (en) * 2017-10-17 2020-07-10 厦门市美亚柏科信息股份有限公司 Elasticisearch cluster expansion method supporting balanced resources
CN109241085A (en) * 2018-09-20 2019-01-18 潘丽华 A kind of big data SQL query method for SolrCloud
CN109241085B (en) * 2018-09-20 2022-06-21 郴州职业技术学院 Big data SQL query method for SolrCloud
CN111124696A (en) * 2019-12-30 2020-05-08 北京三快在线科技有限公司 Unit group creation method, unit group creation device, unit group data synchronization method, unit group data synchronization device, unit and storage medium
CN111124696B (en) * 2019-12-30 2023-06-23 北京三快在线科技有限公司 Unit group creation, data synchronization method, device, unit and storage medium
CN111914022A (en) * 2020-07-23 2020-11-10 北京中数智汇科技股份有限公司 Method and device for online capacity expansion of mongodb cluster
CN111949494A (en) * 2020-09-16 2020-11-17 北京浪潮数据技术有限公司 Task regulation and control method, device and related equipment
CN111949494B (en) * 2020-09-16 2022-06-10 北京浪潮数据技术有限公司 Task regulation and control method, device and related equipment

Also Published As

Publication number Publication date
CN106648897B (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN106648897A (en) SOLR cluster extension method and system supporting resource balancing
CN107688999B (en) Block chain-based parallel transaction execution method
US20220214995A1 (en) Blockchain data archiving method, apparatus, and computer-readable storage medium
CN109885316B (en) Hdfs-hbase deployment method and device based on kubernetes
CN107566531A (en) A kind of Elasticsearch cluster expansion methods for supporting balanced resource
CN107967316A (en) A kind of method of data synchronization, equipment and computer-readable recording medium
US9009653B2 (en) Identifying quality requirements of a software product
CN112579692B (en) Data synchronization method, device, system, equipment and storage medium
CN102624865A (en) Cluster load prediction method and distributed cluster management system
CN105577763A (en) Dynamic duplicate consistency maintenance system and method, and cloud storage platform
CN106372160A (en) Distributive database and management method
CN105678118A (en) Generation method and device for software versions with digital certificate
US20150046399A1 (en) Computer system, data allocation management method, and program
US9537941B2 (en) Method and system for verifying quality of server
CN105357306A (en) Multi-platform data sharing system and data sharing method therefor
CN110928860B (en) Data migration method and device
US8584117B2 (en) Method to make SMP/E based products self describing
CN115495465A (en) Data updating method and device and electronic equipment
CN103761248B (en) The method and system of data query are carried out using memory database
Ovando-Leon et al. A simulation tool for a large-scale nosql database
CN111917826A (en) PBFT consensus algorithm based on block chain intellectual property protection
CN110851515A (en) Big data ETL model execution method and medium based on Spark distributed environment
CN103927264A (en) Method for distributing running memory space of map data of airborne digital map software
CN106155801B (en) A kind of method and resource management center of equipment transportation
US11928612B1 (en) Fixing a changing weave using a finalize node

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant