WO2017177800A1 - Solr cluster automatic expansion method and system, and computer storage medium

Publication number
WO2017177800A1
WO2017177800A1 PCT/CN2017/077557 CN2017077557W WO2017177800A1 WO 2017177800 A1 WO2017177800 A1 WO 2017177800A1 CN 2017077557 W CN2017077557 W CN 2017077557W WO 2017177800 A1 WO2017177800 A1 WO 2017177800A1
Authority
WO
WIPO (PCT)
Prior art keywords
slice
original
nodes
copies
copy
Prior art date
Application number
PCT/CN2017/077557
Other languages
English (en)
French (fr)
Inventor
王志超
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017177800A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0876 Aspects of the degree of configuration automation
    • H04L41/0886 Fully automatic configuration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0813 Configuration setting characterised by the conditions triggering a change of settings
    • H04L41/082 Configuration setting characterised by the conditions triggering a change of settings the condition being updates or upgrades of network functionality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Definitions

  • The present invention relates to the field of distributed search engines, and in particular to a method and system for automatic expansion of an enterprise search application server (Solr, Searching on Lucene w/Replication) cluster, and a computer storage medium.
  • Solr is a standalone enterprise search application server that provides a Web-service-like API (Application Programming Interface).
  • The user can submit an Extensible Markup Language (XML) file in a specified format to the search engine server through a HyperText Transfer Protocol (HTTP) request to generate an index, or can issue a lookup request through an HTTP GET operation and receive the returned result in XML format.
  • The Solr cluster needs to be expanded.
  • The current method of capacity expansion is to manually split and merge the target index data according to the current slices and copies of the cluster. Because the splitting is manual, the operation steps are fragmented and complicated, and parameters such as the number of splits and the size of each split must be determined by hand, with very low accuracy. The prior art therefore offers no general and reliable expansion system.
  • An embodiment of the present invention provides a Solr cluster automatic expansion method and system, and a computer storage medium, which solve the problems of high complexity, error-proneness, and low efficiency of expansion caused by excessive manual operation in the prior art.
  • An embodiment of the present invention provides a method for automatically expanding a Solr cluster, including:
  • The first parameter includes the range of the hash ring in which the copy of each original slice that needs to be segmented lies. Determining, according to the number of original nodes and the number of copies of each slice, the first parameter of the copy of each original slice that needs to be segmented includes: determining the number of copies of original slices that need to be segmented according to the number of original nodes and the number of copies of each slice; and determining, according to that number, the range of the hash ring in which each such copy lies, wherein the ranges of the copies of the original slices within the hash ring are equal in size.
  • Determining, according to the number of original nodes and the number of copies, the copies of the original slices that need to be segmented includes: selecting, from the copies of all the original slices, one copy of each original slice having a different range within the hash ring as a copy of an original slice that needs to be segmented.
  • The second parameter includes the range of the hash ring in which each target slice lies. Determining the second parameter of each target slice according to the number of original nodes, the number of newly added nodes, and the number of copies includes: determining the number of target nodes according to the number of original nodes and the number of newly added nodes; determining the number of target slices according to the number of target nodes and the number of copies of each slice; and determining, according to the number of target slices, the range of the hash ring in which each target slice lies, wherein the ranges of the target slices within the hash ring are equal in size.
  • The method further includes: determining whether the obtained number of original nodes and number of newly added nodes are reasonable.
  • The method further includes: deleting the original slices and the corresponding data.
  • The method further includes: checking each newly generated routing table and the data amount of each node, and determining whether the expansion is completed.
  • the embodiment of the invention further provides a Solr cluster automatic expansion system, comprising:
  • the obtaining module is configured to obtain the number of original nodes in the Solr cluster, the number of copies of each slice, and the number of newly added nodes after the Solr cluster is expanded;
  • a first analyzing module configured to determine, according to the number of the original nodes and the number of copies of each slice, a first parameter of a copy of each original slice that needs to be segmented;
  • a second analyzing module configured to determine a second parameter of each destination slice according to the number of the original node, the number of newly added nodes, and the number of copies of each slice;
  • the sharding module is configured to perform segmentation on the copy of each original slice that needs to be segmented according to the first parameter and the second parameter to obtain a current slice;
  • the merging module is configured to perform corresponding merging of the current slices according to the second parameter to obtain the target slice.
  • the first parameter includes a range in which a copy of each original slice that needs to be segmented is within a hash ring;
  • the first analysis module includes a first number submodule and a first range submodule;
  • the first number submodule is configured to determine the number of copies of original slices that need to be segmented according to the number of original nodes and the number of copies of each slice;
  • the first range submodule is configured to determine, according to the number of copies of original slices that need to be segmented as determined by the first number submodule, the range of the hash ring in which the copy of each original slice that needs to be segmented lies;
  • the ranges of the copies of the original slices within the hash ring are equal in size.
  • The first number submodule includes a selection submodule configured to select, from the copies of all the original slices, one copy of each original slice having a different range within the hash ring as a copy of an original slice that needs to be segmented.
  • the second parameter includes: a range in which each target slice is within the hash ring;
  • the second analysis module includes a second number submodule and a second range submodule; the second number submodule is configured to determine the number of target nodes according to the number of original nodes and the number of newly added nodes, and to determine the number of target slices according to the number of target nodes and the number of copies of each slice;
  • the second range submodule is configured to determine, according to the number of target slices determined by the second number submodule, the range of the hash ring in which each target slice lies, wherein the ranges of the target slices within the hash ring are equal in size.
  • The system further includes a determining module configured to determine, after the obtaining module obtains the number of original nodes in the Solr cluster and the number of nodes newly added by the expansion, whether the obtained number of original nodes and number of newly added nodes are reasonable.
  • The system further includes a deleting module configured to delete the original slices and the corresponding data after the merging module merges the current slices.
  • The system further includes a final confirmation module configured to check, after the deleting module deletes the redundant original slices and data, the newly generated routing table of each node and the data amount of each node, and to determine whether the expansion is completed.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the Solr cluster automatic expansion method according to the embodiment of the invention.
  • The Solr cluster automatic expansion method and system and the computer storage medium obtain the number of original nodes in the Solr cluster, the number of copies of each slice, and the number of nodes newly added by the expansion, and from these determine the first parameter of the copies of the original slices and the second parameter of the target slices. The copies of the original slices are segmented and merged into target slices accordingly, thereby expanding the Solr cluster; manual operation time is saved, and the accuracy and efficiency of the expansion are improved.
  • FIG. 1 is a flowchart of a method for automatically expanding a Solr cluster according to Embodiment 1 of the present invention
  • FIG. 2 is a schematic diagram of an automatic expansion system of a Solr cluster according to Embodiment 2 of the present invention
  • FIG. 3 is a flowchart of a method for automatically expanding a Solr cluster according to Embodiment 3 of the present invention
  • FIG. 4 is a schematic diagram of an original slice, a current slice, and a target slice according to Embodiment 3 of the present invention.
  • FIG. 5 is a schematic diagram of a segmentation of a current slice according to Embodiment 4 of the present invention.
  • The idea of the present invention is to determine the ranges of the original slices and the ranges of the target slices from the number of original nodes in the Solr cluster, the number of copies, and the number of nodes newly added by the expansion, and from these to determine the number of current slices produced by segmenting the original slices and the range of each. The copies of the original slices are segmented accordingly and merged into target slices, thereby expanding the enterprise search application server cluster and saving manual operation time compared with the prior art.
  • This embodiment provides a method for automatically expanding a Solr cluster. Referring to FIG. 1, the method includes:
  • In a Solr cluster, in general, one node corresponds to one copy of a slice, and every slice has the same number of copies; that is, the number of copies of each original slice is equal, where an original slice is a slice of the Solr cluster before expansion. If the number of original nodes in the Solr cluster is M and the number of copies is r, the number of original slices with different ranges is M/r.
  • The number of newly added nodes is generally not arbitrary and needs to be determined according to the number of copies.
  • M and N are both positive integers; the number of original nodes M and the number of newly added nodes N may be input values.
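  • The reasonableness check on M and N can be sketched as follows. The text only says that N depends on the number of copies; this sketch assumes the concrete rule that both M and N must be positive multiples of r, so that M/r and (M+N)/r are whole numbers. The function name `validate_node_counts` is hypothetical.

```python
def validate_node_counts(m: int, n: int, r: int) -> int:
    """Return the target slice count (m + n) // r, or raise ValueError.

    Assumption (not stated explicitly in the text): m and n must both be
    positive multiples of the copy count r.
    """
    for name, value in (("m", m), ("n", n), ("r", r)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    if m % r != 0:
        raise ValueError(f"original node count {m} is not a multiple of r={r}")
    if n % r != 0:
        raise ValueError(f"added node count {n} is not a multiple of r={r}")
    return (m + n) // r
```

  • For example, with M = 6, N = 4, and r = 2 the check passes and yields 5 target slices, while N = 3 is rejected under the assumed rule.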
  • S102 Determine a first parameter of a copy of each original slice that needs to be segmented according to the number of original nodes and the number of copies of each original slice.
  • From the number of original nodes and the number of copies, the number of original slices having different ranges, that is, M/r, can be determined. In actual operation, since each slice has several copies, only one copy of each distinct slice needs to be segmented; M/r copies of original slices with different ranges are therefore selected as the copies of the original slices to be segmented.
  • One of the copies of each original slice exists as a master copy.
  • If every master copy works normally, the master copy of each original slice is preferably used as the copy to be segmented; if the master copy of some original slice is faulty, another copy of that original slice is selected instead. Besides this selection method, one copy of each distinct original slice may be chosen at random, or any other selection that is convenient to operate may be used.
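  • The preferred selection rule (master copy first, another copy if the master is faulty) can be sketched as below. The `Replica` record and its fields are hypothetical stand-ins for whatever cluster state is available; they are not part of any Solr API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Replica:
    """Hypothetical description of one copy of an original slice."""
    node: str
    is_master: bool
    healthy: bool


def pick_replica_to_split(replicas: Sequence[Replica]) -> Replica:
    """Prefer a healthy master copy; otherwise fall back to any healthy copy."""
    healthy = [rep for rep in replicas if rep.healthy]
    if not healthy:
        raise ValueError("no healthy copy of this original slice is available")
    for rep in healthy:
        if rep.is_master:
            return rep
    return healthy[0]
```

  • Calling this once per distinct original slice yields the M/r copies to be segmented.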
  • The first parameter refers to the parameter of the original slices that need to be segmented, generally their range; more specifically, it is the range of the hash ring in which each original slice that needs to be segmented lies.
  • The first parameter mainly represents information such as the position and range of the original slices that need to be segmented before the expansion. After the M/r copies of original slices that need to be segmented are selected, their ranges in the hash ring are also determined: the copies of the original slices divide the entire hash ring equally, so their ranges in the hash ring are equal in size, and the range in the hash ring of each copy that needs to be segmented can be determined accordingly.
  • The range of the entire hash ring is 0x00-0xffffffff. The copies of all original slices that need to be segmented together make up the complete hash ring, and the range size of the copy of each original slice is (0x00-0xffffffff)r/M.
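  • A minimal sketch of how the equal ranges of the M/r original slices could be computed, assuming a 32-bit ring running from 0x00000000 to 0xFFFFFFFF with inclusive bounds. The helper names are hypothetical, and rounding the boundaries down when the ring does not divide evenly is a choice made here, not something the text specifies.

```python
RING_START = 0x00000000
RING_END = 0xFFFFFFFF  # inclusive upper bound of the assumed 32-bit ring


def equal_ranges(count: int, start: int = RING_START, end: int = RING_END):
    """Split [start, end] into `count` contiguous inclusive ranges of near-equal size."""
    if count <= 0:
        raise ValueError("count must be positive")
    size = end - start + 1
    bounds = [start + (size * i) // count for i in range(count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(count)]


def original_slice_ranges(m: int, r: int):
    """Ranges of the M/r original slices, for M original nodes and r copies per slice."""
    if m <= 0 or r <= 0 or m % r != 0:
        raise ValueError("m must be a positive multiple of r")
    return equal_ranges(m // r)
```

  • With M = 4 and r = 2 this gives two ranges, 0x00000000-0x7FFFFFFF and 0x80000000-0xFFFFFFFF.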
  • The target slices are the final slicing result to be achieved after the Solr cluster is expanded.
  • A target slice is formed by segmenting the original slices into current slices and then merging current slices.
  • The ranges of the target slices are also equal in size. Since the number of original nodes is M and the number of newly added nodes is N, the number of target nodes is (M+N); the number of copies of each original slice is r, and the number of copies of each slice does not change before and after expansion, so the number of target slices is (M+N)/r.
  • the second parameter is similar to the first parameter and refers to the parameter of the target slice.
  • It is generally the range of each target slice in the state after the Solr cluster is expanded; more specifically, it is the range of the hash ring in which each target slice lies.
  • The second parameter mainly indicates where each target slice should be in the new Solr cluster after expansion, that is, the position and range size of each target slice. After the number of target slices is determined as (M+N)/r, the range occupied by each target slice is also determined: the target slices divide the entire hash ring equally, so their ranges in the hash ring are equal in size, and the range of each target slice in the hash ring can be determined accordingly.
  • The range of the hash ring is 0x00-0xffffffff, and the range size of each target slice is (0x00-0xffffffff)r/(M+N).
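  • The target-slice computation can be sketched with exact fractions of the ring, which avoids rounding questions. Each range is a half-open interval [lo, hi) expressed as a fraction of the whole hash ring; the function name and this representation are illustrative assumptions.

```python
from fractions import Fraction


def target_slice_ranges(m: int, n: int, r: int):
    """Ranges of the (M+N)/r target slices as half-open fractions of the ring."""
    total_nodes = m + n
    if r <= 0 or total_nodes <= 0 or total_nodes % r != 0:
        raise ValueError("m + n must be a positive multiple of r")
    count = total_nodes // r
    return [(Fraction(i, count), Fraction(i + 1, count)) for i in range(count)]
```

  • For M = 6, N = 4, and r = 2 there are 5 target slices, each covering 1/5 of the ring.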
  • From these, the number of current slices obtained after the original slices are segmented, and the range of each within the hash ring, can be determined.
  • A current slice is produced by cutting a copy of an original slice that needs to be segmented, and the range of any current slice is no larger than the range of an original slice; the number of current slices and the range size of each are determined from the range of each original slice and the range of each target slice.
  • Once the number of current slices and the range of each are determined, the original slices are segmented accordingly, so that the resulting current slices satisfy the required number and ranges.
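  • One way to derive the current slices is to overlay the boundaries of the original slices and the target slices on the ring and cut at every boundary. The sketch below does this with exact fractions of the ring; the function names are hypothetical.

```python
from fractions import Fraction


def boundaries(count: int):
    """Boundary points of `count` equal slices, as fractions of the ring."""
    return {Fraction(i, count) for i in range(count + 1)}


def current_slices(original_count: int, target_count: int):
    """Cut the ring at every original-slice and target-slice boundary.

    Returns half-open (lo, hi) ranges in ring order.
    """
    cuts = sorted(boundaries(original_count) | boundaries(target_count))
    return list(zip(cuts, cuts[1:]))
```

  • With 3 original slices and 5 target slices this produces 7 current slices, none of which crosses an original-slice or target-slice boundary.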
  • After segmentation, the original slices become current slices. The current slices are then merged according to the second parameter of the target slices determined in step S103, that is, the range of each target slice within the hash ring.
  • Merging can only be performed between current slices whose ranges are adjacent; the target slices formed by merging must satisfy the second parameter of each target slice determined in step S103.
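  • The merge step can be sketched as grouping the current slices that fall inside each target range and checking that they are adjacent and cover the range exactly. Ranges are half-open fractions of the ring; the function name and the error handling are illustrative assumptions.

```python
from fractions import Fraction


def merge_into_targets(current, target_count: int):
    """Group adjacent current slices into `target_count` equal target slices.

    `current` is a list of half-open (lo, hi) fractions of the ring.
    Returns a list of ((lo, hi), parts) pairs, one per target slice.
    """
    current = sorted(current)
    merged = []
    for i in range(target_count):
        lo, hi = Fraction(i, target_count), Fraction(i + 1, target_count)
        parts = [(a, b) for a, b in current if lo <= a and b <= hi]
        # The parts must tile [lo, hi): each one starts where the previous ended.
        cursor = lo
        for a, b in parts:
            if a != cursor:
                raise ValueError(f"current slices are not adjacent inside [{lo}, {hi})")
            cursor = b
        if cursor != hi:
            raise ValueError(f"current slices do not cover [{lo}, {hi})")
        merged.append(((lo, hi), parts))
    return merged
```

  • A target slice whose range already equals a single current slice needs no merging; that current slice simply becomes the target slice.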
  • After the merge, the original slices and the corresponding data need to be deleted. The original slices here mainly refer to the copies that were not selected: when the copies to be segmented were chosen, only one copy of each distinct original slice was selected for segmentation, and the remaining pre-expansion copies serve no purpose after the expansion. They should therefore be deleted, and new copies of the target slices are generated on the corresponding nodes.
  • After the expansion, each node generates a new routing table. After the redundant original slices and data are deleted, the newly generated routing tables and the data volume of each node can also be checked to determine whether the expansion is complete.
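  • The text does not say what the final check compares, so the sketch below assumes two criteria: the routing table's ranges must cover the assumed 32-bit ring contiguously, and each node's data volume must lie within a tolerance of the mean. The function name, the input shapes, and the 10% default tolerance are all hypothetical.

```python
def expansion_complete(routing_table, docs_per_node, tolerance=0.10,
                       ring_end=0xFFFFFFFF):
    """Return True if the routing ranges tile the ring and node data is balanced.

    routing_table: list of inclusive (start, end) hash ranges, one per target slice.
    docs_per_node: dict mapping a node name to its document count.
    """
    ranges = sorted(routing_table)
    if not ranges or ranges[0][0] != 0 or ranges[-1][1] != ring_end:
        return False
    for (_, prev_end), (start, _) in zip(ranges, ranges[1:]):
        if start != prev_end + 1:  # gap or overlap between neighbouring ranges
            return False
    if not docs_per_node:
        return False
    mean = sum(docs_per_node.values()) / len(docs_per_node)
    return all(abs(count - mean) <= tolerance * mean
               for count in docs_per_node.values())
```

  • A real deployment would choose its own balance threshold; the point is only that both the routing ranges and the per-node data volume are inspected.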
  • the nodes of the Solr cluster in this embodiment are generally based on a consistent hash route.
  • Consistent-hash routing is balanced, so the slice ranges corresponding to the nodes are equal in size and the amounts of stored data are also equal.
  • This embodiment provides a Solr cluster automatic capacity expansion system. Referring to FIG. 2, the system includes:
  • the obtaining module 21 is configured to obtain the number of original nodes in the Solr cluster, the number of copies of each slice, and the number of newly added nodes after the Solr cluster is expanded;
  • the first analyzing module 22 is configured to determine, according to the number of original nodes and the number of copies, a first parameter of a copy of each original slice that needs to be segmented;
  • the second analyzing module 23 is configured to determine a second parameter of each destination slice according to the number of original nodes, the number of newly added nodes, and the number of copies of each original slice;
  • the segmentation module 25 is configured to perform segmentation on a copy of each original slice that needs to be segmented according to the first parameter and the second parameter to obtain a current slice;
  • the merging module 26 is configured to perform corresponding merging of the current slices according to the second parameter of each target slice determined by the second analyzing module 23 to obtain a target slice.
  • In a Solr cluster, in general, one node corresponds to one copy of a slice, and every slice has the same number of copies; that is, the number of copies of each original slice is equal, where an original slice is a slice of the Solr cluster before expansion. If the number of original nodes in the Solr cluster is M and the number of copies is r, the number of original slices with different ranges is M/r. When expanding, nodes are added as required; the number of newly added nodes is generally not arbitrary and needs to be determined according to the number of copies.
  • A determining module 27 may further be included, configured to determine, after the number M of original nodes in the Solr cluster and the number N of nodes newly added by the expansion are acquired, whether the acquired M and N are reasonable.
  • Here M and N are positive integers. The number of original nodes M and the number of newly added nodes N may be input values; in that case it is necessary to judge whether the input number of original nodes M is consistent with the actual situation, and whether the number of newly added nodes N satisfies the above condition. If the system obtains the values automatically, it is generally unnecessary to judge M; only the number of newly added nodes N needs to be checked against the requirement.
  • The first parameter refers to the parameter of the original slices that need to be segmented, generally their range; more specifically, it is the range of the hash ring in which each original slice that needs to be segmented lies.
  • the first parameter mainly represents information such as the position and range of the original slice that needs to be sliced before the expansion.
  • The first analysis module 22 includes a first number sub-module 221 and a first range sub-module 222. The first number sub-module 221 is configured to determine the number of original slices with different ranges, that is, M/r, according to the number of original nodes and the number of copies of each original slice; the first range sub-module 222 determines the range of the copy of each original slice within the hash ring according to the number of original slices that need to be segmented. The copies of the distinct original slices divide the entire hash ring equally, so their ranges in the hash ring are equal in size, and the range in the hash ring of the copy of each original slice that needs to be segmented can be determined accordingly.
  • The range of the entire hash ring is 0x00-0xffffffff. The copies of all original slices that need to be segmented together make up the complete hash ring, and the range size of the copy of each original slice is (0x00-0xffffffff)r/M.
  • The first number sub-module 221 further includes a selection sub-module 2211. Since each slice has several copies, only one copy of each distinct slice needs to be segmented, so the selection sub-module 2211 is configured to select M/r copies of original slices with different ranges as the copies of the original slices that need to be segmented. One of the copies of each original slice exists as a master copy.
  • If every master copy works normally, the master copy of each original slice is preferably used as the copy to be segmented; if the master copy of some original slice is faulty, another copy of that original slice is selected instead. Besides this selection method, one copy of each distinct original slice may be chosen at random, or any other selection that is convenient to operate may be used.
  • the second parameter is similar to the first parameter and refers to the parameter of the target slice.
  • It is generally the range of each target slice in the state after the Solr cluster is expanded; more specifically, it is the range of the hash ring in which each target slice lies.
  • the second parameter is mainly used to indicate what position the each target slice should be in the new Solr cluster after expansion, that is, the location and range size of each target slice.
  • The second analysis module 23 includes a second number sub-module 231 and a second range sub-module 232. The second number sub-module 231 is configured to determine the number of target nodes, that is, (M+N), according to the number M of original nodes and the number N of newly added nodes, and to determine the number of target slices, that is, (M+N)/r, according to the number of target nodes (M+N) and the number of copies r.
  • the target slice is the result of the segmentation of the final slice to be achieved after the Solr cluster is expanded.
  • A target slice is formed by segmenting the original slices into current slices and then merging current slices.
  • the range of the size of each target slice is also equal.
  • The second range sub-module 232 is configured to determine, according to the number of target slices, the range of the hash ring in which each target slice lies. After the number of target slices is determined as (M+N)/r, the range occupied by each target slice is also determined.
  • The target slices divide the entire hash ring equally, so their ranges in the hash ring are equal in size, and the range of each target slice in the hash ring can be determined accordingly.
  • The range of the hash ring is 0x00-0xffffffff, and the range size of each target slice is (0x00-0xffffffff)r/(M+N).
  • After the merge, the original slices and the corresponding data need to be deleted. The original slices here mainly refer to the copies that were not selected: when the copies to be segmented were chosen, only one copy of each distinct original slice was selected for segmentation, and the remaining pre-expansion copies serve no purpose after the expansion. They should therefore be deleted, and new copies of the target slices are generated on the corresponding nodes. A deletion module 28 is therefore further included, configured to delete the original slices and the corresponding data after the current slices are merged.
  • After the expansion, each node generates a new routing table. A final confirmation module 29 may further be included, configured to check, after the redundant original slices and data are deleted, each newly generated routing table and the data amount of each node, and to determine whether the expansion is completed.
  • the nodes of the Solr cluster in this embodiment are generally based on a consistent hash route.
  • Consistent-hash routing is balanced, so the slice ranges corresponding to the nodes are equal in size and the amounts of stored data are also equal.
  • The second range sub-module 232 may be implemented by a central processing unit (CPU), a digital signal processor (DSP), a microcontroller unit (MCU), or a field-programmable gate array (FPGA) in the Solr cluster automatic expansion system.
  • This embodiment provides a method for automatically expanding a Solr cluster. Referring to FIG. 3, the method includes:
  • S303 Determine the number of original slices that need to be segmented according to the number M of original nodes and the number r of copies.
  • the number of original slices that need to be sliced is M/r.
  • S304 Determine, according to the number M/r of the original slices that need to be segmented, the range in which the original slices that need to be segmented are in the hash ring.
  • The hash ring range size of each original slice that needs to be segmented is (0x00-0xffffffff)r/M.
  • S305 Determine the number of the target slices according to the number M of the original nodes, the number N of the newly added nodes, and the number r of the copies.
  • the number of the target slices is (M+N)/r.
  • S306. Determine, according to the number of the target slices (M+N)/r, a range in which each target slice is in the hash ring.
  • the size of the hash ring of each of the target slices is (0x00-0xffffffff)r/(M+N).
  • S307 Compare the range of each original slice that needs to be segmented with the range of each target slice, and determine the number of current slices obtained after the original slices are segmented and the range of each.
  • The result of comparing the original slices 41 with the target slices 42 is that the number of current slices 43 is 7, and the ranges of the current slices 43 are respectively 3/5, 2/5, 1/5, 3/5, 1/5, 2/5, and 3/5 of the original slice before segmentation, which is 1/5, 2/15, 1/15, 1/5, 1/15, 2/15, and 1/5 of the hash ring.
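  • These figures can be reproduced by hand. The fractions imply that each original slice covers 1/3 of the ring and each target slice 1/5, that is, 3 original slices and 5 target slices; this is inferred from the numbers rather than stated. Taking the union of both sets of boundaries and differencing neighbours gives the seven current slices:

```latex
\begin{aligned}
\text{cuts} &= \left\{0,\tfrac{1}{3},\tfrac{2}{3},1\right\}\cup\left\{0,\tfrac{1}{5},\tfrac{2}{5},\tfrac{3}{5},\tfrac{4}{5},1\right\}
             = \left\{0,\tfrac{1}{5},\tfrac{1}{3},\tfrac{2}{5},\tfrac{3}{5},\tfrac{2}{3},\tfrac{4}{5},1\right\} \\
\text{widths (share of ring)} &= \tfrac{1}{5},\ \tfrac{2}{15},\ \tfrac{1}{15},\ \tfrac{1}{5},\ \tfrac{1}{15},\ \tfrac{2}{15},\ \tfrac{1}{5} \\
\text{widths} \div \tfrac{1}{3}\ \text{(share of original slice)} &= \tfrac{3}{5},\ \tfrac{2}{5},\ \tfrac{1}{5},\ \tfrac{3}{5},\ \tfrac{1}{5},\ \tfrac{2}{5},\ \tfrac{3}{5}
\end{aligned}
```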
  • S309 Perform corresponding merging on the current slices according to the ranges of the target slices determined in step S306.
  • the nodes of the Solr cluster in this embodiment are generally based on a consistent hash route.
  • Consistent-hash routing is balanced, so the slice ranges corresponding to the nodes are equal in size and the amounts of stored data are also equal.
  • This embodiment provides a method for segmenting a copy of an original slice. Please refer to FIG. 5:
  • The current slices are a first current slice 531, a second current slice 532, a third current slice 533, and a fourth current slice 534; the four current slices occupy 2/3, 1/3, 1/3, and 2/3 of their original slices respectively, which is 1/3, 1/6, 1/6, and 1/3 of the entire hash ring;
  • the selected copies of the original slices that need to be segmented are segmented accordingly;
  • the current slices are then merged to generate target slices 52: the second current slice 532 and the third current slice 533 are merged to form one target slice 52.
  • The first current slice 531 and the fourth current slice 534 each directly serve as a target slice 52.
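  • The shares quoted in this embodiment match a configuration of 2 original slices and 3 target slices, which is inferred from the fractions rather than stated. A small sketch, with a hypothetical function name, computes what share of its original slice each current slice occupies:

```python
from fractions import Fraction


def shares_of_original(original_count: int, target_count: int):
    """Share of its parent original slice taken by each current slice, in ring order."""
    cuts = sorted(
        {Fraction(i, original_count) for i in range(original_count + 1)}
        | {Fraction(j, target_count) for j in range(target_count + 1)}
    )
    original_width = Fraction(1, original_count)
    # Every original boundary is a cut, so each piece lies inside one original slice.
    return [(hi - lo) / original_width for lo, hi in zip(cuts, cuts[1:])]
```

  • Multiplying each share by the original slice width (1/2 of the ring in this case) gives the corresponding share of the whole hash ring.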
  • the embodiment of the invention further describes a computer storage medium, wherein the computer storage medium stores a computer program, and the computer program is used to execute the automatic expansion method of the Solr cluster shown in FIG. 1 or FIG. 3 in the embodiment of the invention.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • There may be other division manners, for example: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • The coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as the unit may or may not be physical units, that is, may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated into one unit;
  • the unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • when executed, the program performs the steps of the above method embodiments; the foregoing storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the above-described integrated unit of the present invention may be stored in a computer readable storage medium if it is implemented in the form of a software function module and sold or used as a standalone product.
  • the technical solution of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes a number of instructions.
  • the instructions cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a mobile storage device, a ROM, a RAM, a magnetic disk, or an optical disk.
  • the technical solution of the embodiments of the present invention obtains the number of original nodes in the enterprise search application server cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion, and from these determines the first parameter of the replicas of the original slices and the second parameter of the target slices.
  • the replicas of the original slices are split accordingly and then merged into the target slices, thereby expanding the enterprise search application server cluster; this saves manual operation time and improves the accuracy and efficiency of the expansion.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Automation & Control Theory (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

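Embodiments of the present invention disclose a method and system for automatically expanding an enterprise search application server cluster, and a computer storage medium. The method includes: obtaining the number of original nodes in the enterprise search application server cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion; determining a first parameter of the replicas of the original slices and a second parameter of the target slices; splitting the replicas of the original slices according to the first parameter and the second parameter to obtain current slices; and merging the current slices according to the second parameter to obtain the target slices.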

Description

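Method and System for Automatic Expansion of a Solr Cluster, and Computer Storage Medium
Technical Field
The present invention relates to the field of distributed search engines, and in particular to a method and system for automatically expanding an enterprise search application server (Solr, Searching on lucene w/Replication) cluster, and a computer storage medium.
Background
Solr is a standalone enterprise search application server that exposes a Web-service-like application programming interface (API). A user can submit an Extensible Markup Language (XML) file in a prescribed format to the search engine server through a HyperText Transfer Protocol (HTTP) request to build an index, and can issue a search request through an HTTP GET operation and receive the result in XML format.
As the index grows, search response time becomes longer and new content is indexed more slowly, both of which hurt retrieval. The Solr cluster therefore needs to be expanded.
At present, expansion is done only by manually splitting and merging the target index data according to the slices and replicas of the current cluster. Because the splitting is manual, the procedure is fragmented and complicated, and the parameters that must be determined for splitting, such as the number of splits and the size of each split, are determined with very low accuracy. The prior art therefore offers no generally applicable, reliable expansion system.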
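Summary
To solve the above technical problem, embodiments of the present invention provide a method and system for automatically expanding a Solr cluster, and a computer storage medium, which address the prior-art problem that excessive manual operation makes expansion complex, error-prone, and inefficient.
The technical solutions of the embodiments of the present invention are implemented as follows:
An embodiment of the present invention provides a method for automatically expanding a Solr cluster, including:
obtaining the number of original nodes in the Solr cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion of the Solr cluster;
determining, according to the number of original nodes and the number of replicas of each slice, a first parameter of the replica of each original slice that needs to be split;
determining, according to the number of original nodes, the number of newly added nodes, and the number of replicas of each slice, a second parameter of each target slice;
splitting, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices;
merging the current slices according to the second parameter to obtain the target slices.
In one implementation, the first parameter includes the range that the replica of each original slice to be split occupies on the hash ring. Determining the first parameter according to the number of original nodes and the number of replicas of each slice includes: determining, according to the number of original nodes and the number of replicas of each slice, the number of original-slice replicas to be split; and determining, according to that number, the range that each original-slice replica to be split occupies on the hash ring, where the ranges occupied by the replicas of the original slices on the hash ring are equal in size.
In one implementation, determining the original-slice replicas to be split according to the number of original nodes and the number of replicas includes: selecting, from all replicas of the original slices, one replica of each original slice that occupies a distinct range on the hash ring as an original-slice replica to be split.
In one implementation, the second parameter includes the range that each target slice occupies on the hash ring. Determining the second parameter according to the number of original nodes, the number of newly added nodes, and the number of replicas includes: determining the number of target nodes from the number of original nodes and the number of newly added nodes; determining the number of target slices from the number of target nodes and the number of replicas of each slice; and determining, from the number of target slices, the range that each target slice occupies on the hash ring, where the ranges occupied by the target slices on the hash ring are equal in size.
In one implementation, after obtaining the number of original nodes in the Solr cluster and the number of nodes newly added by the expansion, the method further includes: confirming that the obtained number of original nodes and number of newly added nodes are reasonable.
In one implementation, after merging the current slices, the method further includes: deleting the original slices and the corresponding data.
In one implementation, after deleting the original slices and data, the method further includes: checking each newly generated routing table and the data volume of each node to determine whether the expansion is complete.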
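An embodiment of the present invention further provides a system for automatically expanding a Solr cluster, including:
an obtaining module, configured to obtain the number of original nodes in the Solr cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion of the Solr cluster;
a first analysis module, configured to determine, according to the number of original nodes and the number of replicas of each slice, a first parameter of the replica of each original slice that needs to be split;
a second analysis module, configured to determine, according to the number of original nodes, the number of newly added nodes, and the number of replicas of each slice, a second parameter of each target slice;
a splitting module, configured to split, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices;
a merging module, configured to merge the current slices according to the second parameter to obtain the target slices.
In one implementation, the first parameter includes the range that the replica of each original slice to be split occupies on the hash ring. The first analysis module includes a first number submodule and a first range submodule. The first number submodule is configured to determine, according to the number of original nodes and the number of replicas of each slice, the number of original-slice replicas to be split. The first range submodule is configured to determine, according to the number determined by the first number submodule, the range that each original-slice replica to be split occupies on the hash ring, where the ranges occupied by the replicas of the original slices on the hash ring are equal in size.
In one implementation, the first number submodule includes a selection submodule, configured to select, from all replicas of the original slices, one replica of each original slice that occupies a distinct range on the hash ring as an original-slice replica to be split.
In one implementation, the second parameter includes the range that each target slice occupies on the hash ring. The second analysis module includes a second number submodule and a second range submodule. The second number submodule is configured to determine the number of target nodes from the number of original nodes and the number of newly added nodes, and to determine the number of target slices from the number of target nodes and the number of replicas of each slice. The second range submodule is configured to determine, from the number of target slices determined by the second number submodule, the range that each target slice occupies on the hash ring, where the ranges occupied by the target slices on the hash ring are equal in size.
In one implementation, the system further includes a confirmation module, configured to confirm, after the obtaining module has obtained the number of original nodes in the Solr cluster and the number of nodes newly added by the expansion, that the obtained number of original nodes and number of newly added nodes are reasonable.
In one implementation, the system further includes a deletion module, configured to delete the original slices and the corresponding data after the merging module has merged the current slices.
In one implementation, the system further includes a final confirmation module, configured to check, after the deletion module has deleted the redundant original slices and data, the routing table of each newly generated node and the data volume of each node, to determine whether the expansion is complete.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, where the computer-executable instructions are used to perform the method for automatically expanding a Solr cluster described in the embodiments of the present invention.
With the method and system for automatically expanding a Solr cluster and the computer storage medium described in the embodiments of the present invention, the number of original nodes in the Solr cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion are obtained; the first parameter of the replicas of the original slices and the second parameter of the target slices are determined; and the replicas of the original slices are split accordingly and then merged into the target slices. The Solr cluster is thereby expanded, which saves manual operation time and improves the accuracy and efficiency of the expansion.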
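Brief Description of the Drawings
FIG. 1 is a flowchart of a method for automatically expanding a Solr cluster according to Embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of a system for automatically expanding a Solr cluster according to Embodiment 2 of the present invention;
FIG. 3 is a flowchart of a method for automatically expanding a Solr cluster according to Embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of the original slices, current slices, and target slices according to Embodiment 3 of the present invention;
FIG. 5 is a schematic diagram of the splitting into current slices according to Embodiment 4 of the present invention.
Detailed Description
The idea of the present invention is as follows: the number of original nodes in the Solr cluster, the number of replicas, and the number of nodes newly added by the expansion are used to determine the ranges of the replicas of the original slices and the ranges of the target slices; from these, the number and ranges of the current slices produced by splitting the original slices are determined; the replicas of the original slices are split accordingly and then merged into the target slices. The enterprise search application server cluster is thereby expanded, which saves manual operation time and improves accuracy compared with the prior art.
Embodiments of the present invention are further described below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments may be combined with one another arbitrarily.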
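Embodiment 1
This embodiment provides a method for automatically expanding a Solr cluster. Referring to FIG. 1, the method includes:
S101. Obtain the number of original nodes in the Solr cluster, the number of replicas of each original slice, and the number of nodes newly added by the expansion of the Solr cluster.
In a Solr cluster, one node generally corresponds to one replica of one slice, and all slices have the same number of replicas; that is, every original slice has the same number of replicas, where an original slice is a slice that exists before the Solr cluster is expanded. If the number of original nodes in the Solr cluster is M and the number of replicas is r, the number of original slices with distinct ranges is M/r. When expanding, nodes are added as required. The number of newly added nodes is generally not arbitrary and depends on the number of replicas: if the number of replicas of an original slice is r=1, the number of newly added nodes only needs to be a positive integer greater than 1; if r=2, the number of newly added nodes must be a multiple of 2; and so on, so that the number of newly added nodes should be a multiple of the number of replicas r. To ensure that the expansion is feasible, after the number of original nodes M and the number of newly added nodes N are obtained, it can be checked whether M and N are reasonable, where M and N are both positive integers. M and N may be entered values, in which case it must be checked whether the entered M matches the actual cluster and whether N satisfies the above condition. If they are obtained automatically by the system, M generally does not need to be checked, and only the number of newly added nodes needs to be checked against the requirement.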
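The reasonableness check described above can be sketched as follows. This is only an illustration: the function name `validate_expansion` and the additional check that M is divisible by r (implied by the slice count M/r being an integer) are assumptions, not part of the original text.

```python
def validate_expansion(original_nodes: int, new_nodes: int, replicas: int) -> int:
    """Check M, N and r, and return the number of distinct original slices (M / r).

    Hypothetical helper: raises ValueError when the inputs are not reasonable.
    """
    for name, value in (
        ("original_nodes", original_nodes),
        ("new_nodes", new_nodes),
        ("replicas", replicas),
    ):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    # Assumption: M must be divisible by r so that M / r is a whole number of slices.
    if original_nodes % replicas != 0:
        raise ValueError("original_nodes must be a multiple of replicas")
    # Rule from the text: the number of newly added nodes is a multiple of r.
    if new_nodes % replicas != 0:
        raise ValueError("new_nodes must be a multiple of replicas")
    return original_nodes // replicas
```

For example, `validate_expansion(4, 2, 2)` accepts the inputs and reports two distinct original slices, while `validate_expansion(4, 3, 2)` is rejected because 3 is not a multiple of 2.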
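S102. Determine, according to the number of original nodes and the number of replicas of each original slice, a first parameter of the replica of each original slice that needs to be split.
From the number of original nodes and the number of replicas of each original slice, the number of original slices with distinct ranges can be determined, namely M/r. In practice, every slice has replicas, and only one replica of each distinct slice needs to be split, so M/r replicas of original slices with distinct ranges can be selected as the original-slice replicas to be split. Among the replicas of each original slice, one acts as the primary replica. Provided that the primary replicas work normally, the primary replica of each original slice is preferably chosen as the replica to be split; if the primary replica of some original slice has failed, another replica of that original slice is chosen instead. Apart from this approach, one replica of each distinct original slice may also be picked at random, or any other convenient selection approach may be used.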
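A minimal sketch of the preferred selection rule (primary replica first, any other working replica as a fallback) is given below. The `Replica` record and its `is_leader` and `healthy` fields are hypothetical and are not taken from the Solr API or from the original text.

```python
from typing import Dict, List, NamedTuple


class Replica(NamedTuple):
    """Hypothetical description of one replica of an original slice."""
    name: str
    is_leader: bool   # True for the primary replica
    healthy: bool     # True if the replica currently works normally


def pick_replicas_to_split(replicas_by_slice: Dict[str, List[Replica]]) -> Dict[str, Replica]:
    """Choose exactly one replica per original slice, preferring a healthy primary."""
    chosen = {}
    for slice_name, replicas in replicas_by_slice.items():
        healthy = [rep for rep in replicas if rep.healthy]
        if not healthy:
            raise RuntimeError(f"no healthy replica available for {slice_name}")
        leader = next((rep for rep in healthy if rep.is_leader), None)
        # Fall back to the first healthy non-primary replica if the primary has failed.
        chosen[slice_name] = leader if leader is not None else healthy[0]
    return chosen
```

Random selection, which the text also allows, would replace the fallback line with a call to `random.choice(healthy)`.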
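The first parameter is a parameter of the original slices to be split, generally the range that each original slice to be split occupies; more specifically, the range that each original slice to be split occupies on the hash ring. The first parameter mainly describes, before the expansion, the position and range size of each original slice to be split. Once the M/r original-slice replicas to be split have been selected, the range that each of them occupies on the hash ring is also determined. The original-slice replicas divide the whole hash ring evenly, so each of them occupies a range of the same size, and the range of each original-slice replica to be split can be determined accordingly. The whole hash ring spans 0x00-0xffffffff, and all the original-slice replicas to be split together make up one complete hash ring, so the range size of each original-slice replica is (0x00-0xffffffff)r/M.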
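As an illustration of dividing the ring evenly, the sketch below splits the range 0x00-0xffffffff into a given number of contiguous ranges. The function name `equal_ranges`, the use of inclusive integer bounds, and the rounding of boundaries when the count does not divide the ring exactly are assumptions made for this example.

```python
HASH_MIN = 0x00000000
HASH_MAX = 0xFFFFFFFF


def equal_ranges(count: int):
    """Split [HASH_MIN, HASH_MAX] into `count` contiguous, inclusive ranges.

    Sizes are equal when `count` divides the ring size and differ by at most
    one hash value otherwise (boundaries are rounded down).
    """
    if count <= 0:
        raise ValueError("count must be positive")
    total = HASH_MAX - HASH_MIN + 1
    bounds = [HASH_MIN + total * i // count for i in range(count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(count)]


# M = 4 original nodes with r = 2 replicas give M / r = 2 original slices.
for low, high in equal_ranges(4 // 2):
    print(f"{low:#010x}-{high:#010x}")
```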
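S103. Determine, according to the number of original nodes, the number of newly added nodes, and the number of replicas of each original slice, a second parameter of each target slice. The target slices are the final slicing result to be reached after the Solr cluster is expanded. They are obtained by splitting the replicas of the original slices into current slices and then merging the current slices accordingly. The target slices likewise all occupy ranges of the same size. Since the number of original nodes is M and the number of newly added nodes is N, the number of target nodes is (M+N); the number of replicas of each original slice is r, and the number of replicas per slice is unchanged by the expansion, so the number of target slices is (M+N)/r.
Like the first parameter, the second parameter is a parameter of the target slices, generally the range that each target slice occupies once the Solr cluster has been expanded; more specifically, the range that each target slice occupies on the hash ring. The second parameter mainly describes where each target slice should sit in the new Solr cluster after the expansion, that is, the position and range size of each target slice. Once the number of target slices is determined to be (M+N)/r, the range of each target slice is also determined. The target slices divide the whole hash ring evenly, so each of them occupies a range of the same size, and the range of each target slice on the hash ring can be determined accordingly. Since the whole hash ring spans 0x00-0xffffffff, the range size of each target slice is (0x00-0xffffffff)r/(M+N).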
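The arithmetic for the second parameter can be sketched with exact fractions of the ring. The helper name `target_slice_plan` is hypothetical, and expressing ranges as fractions of the ring rather than as hash values is a simplification chosen for readability.

```python
from fractions import Fraction


def target_slice_plan(original_nodes: int, new_nodes: int, replicas: int):
    """Return the number of target slices and their ranges as fractions of the ring."""
    target_nodes = original_nodes + new_nodes          # M + N
    if target_nodes % replicas != 0:
        raise ValueError("M + N must be a multiple of r")
    count = target_nodes // replicas                   # (M + N) / r
    size = Fraction(1, count)                          # each target slice covers r / (M + N)
    return count, [(i * size, (i + 1) * size) for i in range(count)]


count, ranges = target_slice_plan(4, 2, 2)
print(count, [(str(a), str(b)) for a, b in ranges])
```

Multiplying each fraction by the ring size 0x100000000 gives the corresponding hash boundaries.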
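S104. Split, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices.
From the first parameter and the second parameter, the number of current slices produced by splitting the original slices, and the range each of them occupies on the hash ring, can be determined.
A current slice is produced by splitting the replica of an original slice that needs to be split, and the range of any current slice is never larger than that of its original slice. The number of current slices and the range size of each current slice are determined by the ranges of the original slices and the ranges of the target slices.
Once the number of current slices and the range of each current slice have been determined, the original slices are split accordingly, so that the resulting current slices satisfy the required number and ranges.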
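One way to derive the current slices, offered here as an assumption rather than as the procedure the text prescribes, is to take the union of the original-slice boundaries and the target-slice boundaries: every interval between two neighbouring boundaries is one current slice.

```python
from fractions import Fraction


def current_slices(original_count: int, target_count: int):
    """Return current slices as (start, end) fractions of the hash ring.

    The cut points are the union of the boundaries of `original_count` equal
    original slices and `target_count` equal target slices.
    """
    cuts = sorted(
        {Fraction(i, original_count) for i in range(original_count + 1)}
        | {Fraction(j, target_count) for j in range(target_count + 1)}
    )
    return list(zip(cuts, cuts[1:]))


# Two original slices and three target slices yield four current slices.
for start, end in current_slices(2, 3):
    print(f"{start} .. {end} (size {end - start})")
```

By construction no current slice crosses an original-slice boundary, so each one is no larger than the original slice it is cut from.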
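S105. Merge the current slices according to the second parameter to obtain the target slices.
After the original slices have been split, they have become current slices. The current slices are then merged according to the second parameter of the target slices determined in step S103, that is, according to the range each target slice occupies on the hash ring. Merging can generally take place only between current slices whose ranges are adjacent, and the target slices formed by merging must satisfy the second parameter of each target slice determined in step S103.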
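The merge step can be illustrated by grouping the current slices by the target slice whose range contains them. The function name `merge_into_targets` and the representation of slices as fractions of the ring are assumptions for this sketch; it only plans the grouping and does not move any index data.

```python
from fractions import Fraction
from math import floor


def merge_into_targets(current, target_count: int):
    """Group current slices (start, end), given in ring order, by target slice.

    Returns one list per target slice; each list holds adjacent current slices
    that together cover exactly that target slice's range.
    """
    groups = [[] for _ in range(target_count)]
    for start, end in current:
        index = floor(start * target_count)            # target slice containing `start`
        if end > Fraction(index + 1, target_count):
            raise ValueError("current slice crosses a target-slice boundary")
        groups[index].append((start, end))
    return groups


current = [
    (Fraction(0), Fraction(1, 3)),
    (Fraction(1, 3), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(2, 3)),
    (Fraction(2, 3), Fraction(1)),
]
for number, parts in enumerate(merge_into_targets(current, 3), start=1):
    print(number, [(str(a), str(b)) for a, b in parts])
```

A group with a single entry means that the current slice already is a target slice and needs no merging.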
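After the current slices have been merged into target slices, the original slices and the corresponding data also need to be deleted. The original slices here are mainly the replicas that were not split: when the original slices to be split were selected, only one replica of each original slice with a distinct range was chosen for splitting, so the other replicas, being pre-expansion slices, serve no purpose after the expansion. They should be deleted, and replicas of the new target slices should be generated on the corresponding nodes.
After the expansion, each node generates a new routing table. After the redundant original slices and data have been deleted, each newly generated routing table and the data volume of each node can also be checked to determine whether the expansion is complete.
The nodes of the Solr cluster in this embodiment are generally based on consistent-hash routing, which is characterized by balance: the slices corresponding to the nodes cover ranges of the same size and store equal amounts of data.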
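Embodiment 2
This embodiment provides a system for automatically expanding a Solr cluster. Referring to FIG. 2, the system includes:
an obtaining module 21, configured to obtain the number of original nodes in the Solr cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion of the Solr cluster;
a first analysis module 22, configured to determine, according to the number of original nodes and the number of replicas, a first parameter of the replica of each original slice that needs to be split;
a second analysis module 23, configured to determine, according to the number of original nodes, the number of newly added nodes, and the number of replicas of each original slice, a second parameter of each target slice;
a splitting module 25, configured to split, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices;
a merging module 26, configured to merge the current slices according to the second parameter of each target slice determined by the second analysis module 23, to obtain the target slices.
In a Solr cluster, one node generally corresponds to one replica of one slice, and all slices have the same number of replicas; that is, every original slice has the same number of replicas, where an original slice is a slice that exists before the Solr cluster is expanded. If the number of original nodes in the Solr cluster is M and the number of replicas is r, the number of original slices with distinct ranges is M/r. When expanding, nodes are added as required. The number of newly added nodes is generally not arbitrary and depends on the number of replicas: if the number of replicas of an original slice is r=1, the number of newly added nodes only needs to be a positive integer greater than 1; if r=2, the number of newly added nodes must be a multiple of 2; and so on, so that the number of newly added nodes should be a multiple of the number of replicas r. To ensure that the expansion is feasible, the system further includes a confirmation module 27, configured to check, after the number of original nodes M and the number of newly added nodes N have been obtained, whether M and N are reasonable, where M and N are both positive integers. M and N may be entered values, in which case it must be checked whether the entered M matches the actual cluster and whether N satisfies the above condition. If they are obtained automatically by the system, M generally does not need to be checked, and only the number of newly added nodes needs to be checked against the requirement.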
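The first parameter is a parameter of the original slices to be split, generally the range that each original slice to be split occupies; more specifically, the range that each original slice to be split occupies on the hash ring. The first parameter mainly describes, before the expansion, the position and range size of each original slice to be split. The first analysis module 22 includes a first number submodule 221 and a first range submodule 222. The first number submodule 221 is configured to determine, from the number of original nodes and the number of replicas of each original slice, the number of original slices with distinct ranges, namely M/r. The first range submodule 222 determines, from this number of original slices to be split, the range that the replica of each original slice occupies on the hash ring. The replicas of the distinct original slices divide the whole hash ring evenly, so each of them occupies a range of the same size, and the range of each original-slice replica to be split can be determined accordingly. The whole hash ring spans 0x00-0xffffffff, and all the original-slice replicas to be split together make up one complete hash ring, so the range size of each original-slice replica is (0x00-0xffffffff)r/M.
The first number submodule 221 further includes a selection submodule 2211. Every slice has replicas, and only one replica of each distinct slice needs to be split, so the selection submodule 2211 is configured to select M/r replicas of original slices with distinct ranges as the original-slice replicas to be split. Among the replicas of each original slice, one acts as the primary replica. Provided that the primary replicas work normally, the primary replica of each original slice is preferably chosen as the replica to be split; if the primary replica of some original slice has failed, another replica of that original slice is chosen instead. Apart from this approach, one replica of each distinct original slice may also be picked at random, or any other convenient selection approach may be used.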
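Like the first parameter, the second parameter is a parameter of the target slices, generally the range that each target slice occupies once the Solr cluster has been expanded; more specifically, the range that each target slice occupies on the hash ring. The second parameter mainly describes where each target slice should sit in the new Solr cluster after the expansion, that is, the position and range size of each target slice. The second analysis module 23 includes a second number submodule 231 and a second range submodule 232. The second number submodule 231 is configured to determine the number of target nodes, namely (M+N), from the number of original nodes M and the number of newly added nodes N, and to determine the number of target slices, namely (M+N)/r, from the number of target nodes (M+N) and the number of replicas r. The target slices are the final slicing result to be reached after the Solr cluster is expanded. They are obtained by splitting the original slices into current slices and then merging the current slices accordingly. The target slices likewise all occupy ranges of the same size.
The second range submodule 232 is configured to determine, from the number of target slices, the range that each target slice occupies on the hash ring. Once the number of target slices is determined to be (M+N)/r, the range of each target slice is also determined. The target slices divide the whole hash ring evenly, so each of them occupies a range of the same size, and the range of each target slice on the hash ring can be determined accordingly. Since the whole hash ring spans 0x00-0xffffffff, the range size of each target slice is (0x00-0xffffffff)r/(M+N).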
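After the current slices have been merged into target slices, the original slices and the corresponding data also need to be deleted. The original slices here are mainly the replicas that were not split: when the original slices to be split were selected, only one replica of each original slice with a distinct range was chosen for splitting, so the other replicas, being pre-expansion slices, serve no purpose after the expansion. They should be deleted, and replicas of the new target slices should be generated on the corresponding nodes. The system therefore further includes a deletion module 28, configured to delete the original slices and the corresponding data after the current slices have been merged.
After the expansion, each node generates a new routing table. Finally, the system may further include a final confirmation module 29, configured to check, after the redundant original slices and data have been deleted, each newly generated routing table and the data volume of each node, to determine whether the expansion is complete.
The nodes of the Solr cluster in this embodiment are generally based on consistent-hash routing, which is characterized by balance: the slices corresponding to the nodes cover ranges of the same size and store equal amounts of data.
In the embodiments of the present invention, the obtaining module 21, the first analysis module 22, the second analysis module 23, the splitting module 25, the merging module 26, the confirmation module 27, the deletion module 28, and the final confirmation module 29 of the system for automatically expanding a Solr cluster, the first number submodule 221 and the first range submodule 222 of the first analysis module 22, and the second number submodule 231 and the second range submodule 232 of the second analysis module 23 may each be implemented in practice by a central processing unit (CPU), a digital signal processor (DSP), a microcontroller unit (MCU), or a field-programmable gate array (FPGA) in the system.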
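Embodiment 3
This embodiment provides a method for automatically expanding a Solr cluster. Referring to FIG. 3, the method includes:
S301. Obtain the number of original nodes M in the Solr cluster, the number of replicas r, and the number of nodes N newly added by the expansion of the Solr cluster.
S302. Determine whether the number of original nodes M and the number of newly added nodes N are reasonable. Reasonable here means that M and N are both positive integers and that N is a multiple of the number of replicas r.
S303. Determine the number of original slices to be split from the number of original nodes M and the number of replicas r. The number of original slices to be split is M/r.
S304. Determine, from the number of original slices to be split M/r, the range that each original slice to be split occupies on the hash ring. The hash-ring range size of each original slice to be split is (0x00-0xffffffff)r/M.
S305. Determine the number of target slices from the number of original nodes M, the number of newly added nodes N, and the number of replicas r. The number of target slices is (M+N)/r.
S306. Determine, from the number of target slices (M+N)/r, the range that each target slice occupies on the hash ring. The hash-ring range size of each target slice is (0x00-0xffffffff)r/(M+N).
S307. Compare the ranges of the original slices to be split with the ranges of the target slices, and determine the number of current slices produced by splitting the original slices and the range of each current slice.
For example, referring to FIG. 4, when M=3, r=1, and N=2, comparing the original slices 41 with the target slices 42 gives seven current slices 43, whose range sizes are 3/5, 2/5, 1/5, 3/5, 1/5, 2/5, and 3/5 of the original slice they are cut from, that is, 1/5, 2/15, 1/15, 1/5, 1/15, 2/15, and 1/5 of the hash ring, respectively.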
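The figures in this example can be reproduced with a few lines of exact arithmetic. The sketch assumes that the current-slice boundaries are the union of the original-slice and target-slice boundaries; the function name `split_sizes` is hypothetical.

```python
from fractions import Fraction


def split_sizes(m: int, r: int, n: int):
    """Return current-slice sizes as fractions of the ring and of an original slice."""
    originals = m // r              # M / r original slices
    targets = (m + n) // r          # (M + N) / r target slices
    cuts = sorted(
        {Fraction(i, originals) for i in range(originals + 1)}
        | {Fraction(j, targets) for j in range(targets + 1)}
    )
    ring = [end - start for start, end in zip(cuts, cuts[1:])]
    relative = [size * originals for size in ring]   # size / (1 / originals)
    return ring, relative


ring, relative = split_sizes(3, 1, 2)
print([str(f) for f in ring])      # ['1/5', '2/15', '1/15', '1/5', '1/15', '2/15', '1/5']
print([str(f) for f in relative])  # ['3/5', '2/5', '1/5', '3/5', '1/5', '2/5', '3/5']
```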
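S308. Split the original slices according to the number and ranges of the current slices determined in step S307.
S309. Merge the current slices according to the ranges of the target slices determined in step S306.
S310. Delete the redundant original slices and data.
The nodes of the Solr cluster in this embodiment are generally based on consistent-hash routing, which is characterized by balance: the slices corresponding to the nodes cover ranges of the same size and store equal amounts of data.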
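Embodiment 4
This embodiment provides a method for splitting the replicas of the original slices. Referring to FIG. 5:
In this embodiment, the number of original nodes in the Solr cluster is M=4, the number of replicas of each original slice is r=2, and the number of nodes newly added by the expansion is N=2;
From M=4 and r=2 it follows that the number of original slices with distinct ranges is M/r=2, namely a first original slice 511 and a second original slice 512. In other words, the hash ring is divided evenly into two parts by the first original slice 511 and the second original slice 512, and the replica of each original slice occupies 1/2 of the whole hash ring;
From the two replicas of the first original slice 511 and the two replicas of the second original slice 512, one replica of each is selected as an original-slice replica to be split;
From M=4 and N=2 it follows that the number of target nodes is M+N=6, and therefore the number of target slices 52 is (M+N)/r=3. That is, there are three target slices 52, each occupying 1/3 of the hash ring;
Comparing the ranges that the replicas of the original slices occupy on the hash ring with the ranges that the target slices occupy shows that splitting the replicas of the original slices yields four current slices: a first current slice 531, a second current slice 532, a third current slice 533, and a fourth current slice 534. The four current slices occupy 2/3, 1/3, 1/3, and 2/3 of their original slice, respectively, and 1/3, 1/6, 1/6, and 1/3 of the whole hash ring;
The selected original-slice replicas to be split are split according to the range sizes of these four current slices;
The current slices are then merged according to the ranges that the target slices 52 occupy on the hash ring, to produce the target slices 52: the second current slice 532 and the third current slice 533 are merged into one target slice 52, while the first current slice 531 and the fourth current slice 534 each directly serve as a target slice 52.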
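The whole of this embodiment can be traced with a small planning sketch that records, for every target slice, which original slice each of its parts is cut from. The function name `expansion_plan`, the zero-based indices, and the use of ring fractions instead of hash values are assumptions made for illustration; the code plans the split and merge only and does not touch any index data.

```python
from fractions import Fraction
from math import floor


def expansion_plan(m: int, r: int, n: int):
    """Map each target slice index to its parts: (original index, start, end)."""
    originals = m // r               # M / r original slices
    targets = (m + n) // r           # (M + N) / r target slices
    cuts = sorted(
        {Fraction(i, originals) for i in range(originals + 1)}
        | {Fraction(j, targets) for j in range(targets + 1)}
    )
    plan = {index: [] for index in range(targets)}
    for start, end in zip(cuts, cuts[1:]):
        source = floor(start * originals)    # original slice the part is cut from
        target = floor(start * targets)      # target slice the part belongs to
        plan[target].append((source, start, end))
    return plan


plan = expansion_plan(4, 2, 2)
for target, parts in plan.items():
    print(target, [(source, str(a), str(b)) for source, a, b in parts])
```

With M=4, r=2, and N=2 the plan has three target slices: the first and last each consist of a single part (corresponding to current slices 531 and 534), and the middle one combines one part from each original slice (corresponding to current slices 532 and 533).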
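An embodiment of the present invention further describes a computer storage medium storing a computer program, where the computer program is used to perform the method for automatically expanding a Solr cluster shown in FIG. 1 or FIG. 3 of the embodiments of the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
The above is a further detailed description of the present invention in connection with specific implementations, and the specific implementation of the present invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the present invention belongs may make a number of simple deductions or substitutions without departing from the concept of the present invention, and all of these shall be regarded as falling within the protection scope of the present invention.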
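Industrial Applicability
The technical solution of the embodiments of the present invention obtains the number of original nodes in the enterprise search application server cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion; determines the first parameter of the replicas of the original slices and the second parameter of the target slices; and splits the replicas of the original slices accordingly and then merges them into the target slices. The enterprise search application server cluster is thereby expanded, which saves manual operation time and improves the accuracy and efficiency of the expansion.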

Claims (15)

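  1. A method for automatically expanding an enterprise search application server cluster, comprising:
    obtaining the number of original nodes in the enterprise search application server cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion of the enterprise search application server cluster;
    determining, according to the number of original nodes and the number of replicas of each slice, a first parameter of the replica of each original slice that needs to be split;
    determining, according to the number of original nodes, the number of replicas of each slice, and the number of newly added nodes, a second parameter of each target slice;
    splitting, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices;
    merging the current slices according to the second parameter to obtain the target slices.
  2. The method for automatically expanding an enterprise search application server cluster according to claim 1, wherein the first parameter comprises the range that the replica of each original slice to be split occupies on the hash ring;
    the determining, according to the number of original nodes and the number of replicas of each slice, the first parameter of the replicas of the original slices to be split comprises:
    determining, according to the number of original nodes and the number of replicas of each slice, the number of original-slice replicas to be split; and determining, according to the number of original-slice replicas to be split, the range that each original-slice replica to be split occupies on the hash ring; wherein the ranges occupied by the replicas of the original slices on the hash ring are equal in size.
  3. The method for automatically expanding an enterprise search application server cluster according to claim 2, wherein the determining, according to the number of original nodes and the number of replicas, the original-slice replicas to be split comprises:
    selecting, from all replicas of the original slices, one replica of each original slice that occupies a distinct range on the hash ring as an original-slice replica to be split.
  4. The method for automatically expanding an enterprise search application server cluster according to claim 1, wherein the second parameter comprises the range that each target slice occupies on the hash ring;
    the determining, according to the number of original nodes, the number of newly added nodes, and the number of replicas, the second parameter of each target slice comprises:
    determining the number of target nodes from the number of original nodes and the number of newly added nodes; determining the number of target slices from the number of target nodes and the number of replicas of each slice; and determining, from the number of target slices, the range that each target slice occupies on the hash ring; wherein the ranges occupied by the target slices on the hash ring are equal in size.
  5. The method for automatically expanding an enterprise search application server cluster according to claim 1, wherein, after the obtaining of the number of original nodes in the enterprise search application server cluster and the number of nodes newly added by the expansion, the method further comprises: confirming that the obtained number of original nodes and number of newly added nodes are reasonable.
  6. The method for automatically expanding an enterprise search application server cluster according to any one of claims 1 to 5, wherein, after the merging of the current slices, the method further comprises: deleting the original slices and the corresponding data.
  7. The method for automatically expanding an enterprise search application server cluster according to claim 6, wherein, after the deleting of the original slices and data, the method further comprises: checking each newly generated routing table and the data volume of each node to determine whether the expansion is complete.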
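  8. A system for automatically expanding an enterprise search application server cluster, comprising:
    an obtaining module, configured to obtain the number of original nodes in the enterprise search application server cluster, the number of replicas of each slice, and the number of nodes newly added by the expansion of the enterprise search application server cluster;
    a first analysis module, configured to determine, according to the number of original nodes and the number of replicas of each slice, a first parameter of the replica of each original slice that needs to be split;
    a second analysis module, configured to determine, according to the number of original nodes, the number of newly added nodes, and the number of replicas of each slice, a second parameter of each target slice;
    a splitting module, configured to split, according to the first parameter and the second parameter, the replica of each original slice that needs to be split, to obtain current slices;
    a merging module, configured to merge the current slices according to the second parameter to obtain the target slices.
  9. The system for automatically expanding an enterprise search application server cluster according to claim 8, wherein the first parameter comprises the range that the replica of each original slice to be split occupies on the hash ring;
    the first analysis module comprises a first number submodule and a first range submodule;
    the first number submodule is configured to determine, according to the number of original nodes and the number of replicas of each slice, the number of original-slice replicas to be split; the first range submodule is configured to determine, according to the number of original-slice replicas to be split determined by the first number submodule, the range that each original-slice replica to be split occupies on the hash ring; wherein the ranges occupied by the replicas of the original slices on the hash ring are equal in size.
  10. The system for automatically expanding an enterprise search application server cluster according to claim 9, wherein the first number submodule comprises a selection submodule, configured to select, from all replicas of the original slices, one replica of each original slice that occupies a distinct range on the hash ring as an original-slice replica to be split.
  11. The system for automatically expanding an enterprise search application server cluster according to claim 8, wherein the second parameter comprises the range that each target slice occupies on the hash ring;
    the second analysis module comprises a second number submodule and a second range submodule;
    the second number submodule is configured to determine the number of target nodes from the number of original nodes and the number of newly added nodes, and to determine the number of target slices from the number of target nodes and the number of replicas of each slice; the second range submodule is configured to determine, from the number of target slices determined by the second number submodule, the range that each target slice occupies on the hash ring; wherein the ranges occupied by the target slices on the hash ring are equal in size.
  12. The system for automatically expanding an enterprise search application server cluster according to claim 8, further comprising a confirmation module, configured to confirm, after the obtaining module has obtained the number of original nodes in the enterprise search application server cluster and the number of nodes newly added by the expansion, that the obtained number of original nodes and number of newly added nodes are reasonable.
  13. The system for automatically expanding an enterprise search application server cluster according to any one of claims 8 to 12, further comprising a deletion module, configured to delete the original slices and the corresponding data after the merging module has merged the current slices.
  14. The system for automatically expanding an enterprise search application server cluster according to claim 13, further comprising a final confirmation module, configured to check, after the deletion module has deleted the original slices and data, the routing table of each newly generated node and the data volume of each node, to determine whether the expansion is complete.
  15. A computer storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to perform the method for automatically expanding an enterprise search application server cluster according to any one of claims 1 to 7.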
PCT/CN2017/077557 2016-04-15 2017-03-21 Solr集群自动扩容方法及系统、计算机存储介质 WO2017177800A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610234871.9 2016-04-15
CN201610234871.9A CN107302444B (zh) 2016-04-15 2016-04-15 企业级搜索应用服务器集群自动扩容方法及装置

Publications (1)

Publication Number Publication Date
WO2017177800A1 true WO2017177800A1 (zh) 2017-10-19

Family

ID=60041367

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/077557 WO2017177800A1 (zh) 2016-04-15 2017-03-21 Solr集群自动扩容方法及系统、计算机存储介质

Country Status (2)

Country Link
CN (1) CN107302444B (zh)
WO (1) WO2017177800A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009236A (zh) * 2017-11-29 2018-05-08 北京锐安科技有限公司 一种大数据查询方法、系统、计算机及存储介质
CN111125139A (zh) * 2019-12-26 2020-05-08 北京浪潮数据技术有限公司 一种多控制器的任务处理方法及相关装置
CN116132289A (zh) * 2022-09-27 2023-05-16 马上消费金融股份有限公司 信息配置方法、装置、设备和介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111371583B (zh) * 2018-12-26 2022-09-23 中兴通讯股份有限公司 服务器的扩容方法及装置、服务器、存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521297A (zh) * 2011-11-30 2012-06-27 北京人大金仓信息技术股份有限公司 无共享数据库集群中实现系统动态扩展的方法
CN102591934A (zh) * 2011-12-23 2012-07-18 国网电力科学研究院 一种基于Zookeeper实现多个Solr Shards自动扩展与切换的方法
CN103488702A (zh) * 2013-09-06 2014-01-01 云南电力试验研究院(集团)有限公司电力研究院 基于SorlCloud非结构化数据检索方法和系统
CN104156367A (zh) * 2013-05-14 2014-11-19 阿里巴巴集团控股有限公司 一种搜索引擎的扩容方法及搜索服务系统
US9171009B1 (en) * 2013-06-21 2015-10-27 Emc Corporation Cluster file system comprising storage server units each having a scale-out network attached storage cluster

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100920714B1 (ko) * 2009-03-06 2009-10-14 (주)도울정보기술 조명 단말노드의 확장이 용이한 원격 조명등 감시제어 시스템
CN103984607A (zh) * 2013-02-08 2014-08-13 华为技术有限公司 分布式存储的方法、装置和系统
CN104035836B (zh) * 2013-03-06 2018-01-02 阿里巴巴集团控股有限公司 集群检索平台中的自动容灾恢复方法及系统
CN103647797A (zh) * 2013-11-15 2014-03-19 北京邮电大学 一种分布式文件系统及其数据访问方法
CN104050102B (zh) * 2014-06-26 2017-09-08 北京思特奇信息技术股份有限公司 一种电信系统中的对象存储方法及装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009236A (zh) * 2017-11-29 2018-05-08 北京锐安科技有限公司 一种大数据查询方法、系统、计算机及存储介质
CN111125139A (zh) * 2019-12-26 2020-05-08 北京浪潮数据技术有限公司 一种多控制器的任务处理方法及相关装置
CN111125139B (zh) * 2019-12-26 2022-04-22 北京浪潮数据技术有限公司 一种多控制器的任务处理方法及相关装置
CN116132289A (zh) * 2022-09-27 2023-05-16 马上消费金融股份有限公司 信息配置方法、装置、设备和介质

Also Published As

Publication number Publication date
CN107302444A (zh) 2017-10-27
CN107302444B (zh) 2022-03-25

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17781776

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17781776

Country of ref document: EP

Kind code of ref document: A1