CN103310023A - Distributed searching system and method - Google Patents


Info

Publication number
CN103310023A
Authority
CN
China
Prior art keywords
burst
index
search
information
bursts
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102818388A
Other languages
Chinese (zh)
Inventor
赵兴成
刘亚军
杨景慧
周辉
黄韶军
姜佰胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE ICT Technologies Co Ltd
Original Assignee
ZTE ICT Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE ICT Technologies Co Ltd
Priority to CN2013102818388A
Publication of CN103310023A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a distributed search system comprising an index creation unit, an index sharding unit and a shard search unit. The index creation unit creates an index for specified data according to a received index creation instruction. The index sharding unit divides the index into a plurality of shards according to received shard configuration data and records the shard information of each shard. The shard search unit determines at least one piece of shard information according to a received search condition, looks up at least one target shard among the plurality of shards according to that shard information, and returns the data corresponding to each target shard to the user. The application further provides a distributed search method. With this technical solution, an index can be sharded, and a search can then target the one or more shards that match the search condition; searching shards rather than the whole index improves search response speed.

Description

Distributed search system and distributed search method
Technical field
The present invention relates to the field of data search technology, and in particular to a distributed search system and a distributed search method.
Background
With the explosive growth of network data, big data processing techniques have become a necessary part of data processing. Among these techniques, Hadoop, by virtue of its high stability, reliability and scalability, has gradually become the standard of the big data industry. Hadoop is nevertheless weak at real-time processing and cannot serve scenarios with high real-time requirements: when a mass data search is run against a Hadoop database, the search is slow and the results are displayed with a considerable delay after the search is entered, which makes it difficult to meet users' demand for real-time search.
A new search technique is therefore needed that improves the response speed, and hence the real-time performance, of mass data search.
Summary of the invention
Based on the above problem, the present invention proposes a search technique that improves the response speed, and hence the real-time performance, of mass data search.
In view of this, the present invention proposes a distributed search system comprising: an index creation unit, configured to create an index for specified data according to a received index creation instruction; an index sharding unit, configured to divide the index into a plurality of shards according to received shard configuration data, and to record the shard information of each of the plurality of shards; and a shard search unit, configured to determine at least one piece of shard information according to a received search condition, to look up at least one target shard among the plurality of shards according to the at least one piece of shard information, and to return the data corresponding to each target shard to the user.
In this technical solution, once an index has been created for some data, the index can be divided into a plurality of shards, and a search can state directly in its search condition the shard information to be searched. Because one index corresponds to several shards, each shard corresponds to a smaller amount of data. Searching a shard therefore finds its small block of data faster than searching the large block of data behind the whole index, and returning the small blocks for the selected shards to the user is likewise faster than returning the large block. This speeds up both querying and returning results, and improves the real-time performance of data search.
In the above technical solution, preferably, the index creation unit is further configured to generate an index database from the metadata information produced while the index is created; the index sharding unit is configured to record the shard information of each shard in the index database; and the shard search unit is configured to determine the at least one piece of shard information in the index database according to the search condition.
In this technical solution, each index creation process generates corresponding metadata information that records the details of the index, such as the data the index corresponds to, the location of the index and the ID of the index. An index database can be built from this metadata, so that when the user enters a search condition the corresponding index can be found in the index database quickly and conveniently. After the index is sharded, the shard information of each shard can also be stored in the index database, so that the matching shard information can be determined quickly and accurately from the search condition and the corresponding shard then located.
In the above technical solution, preferably, the shard configuration data comprises a shard count and/or shard nodes; the shard information comprises a shard identifier and/or the shard node.
In this technical solution, the user can set the shard configuration data as required, thereby specifying how many shards the index is divided into and which node each shard is distributed to (that is, where the shard resides in the server). The shard information can comprise a shard identifier and/or a shard node, so the user can find a shard by entering, in the search instruction, the identifier and/or the location of the shard to be searched.
In the above technical solution, preferably, the system further comprises a shard storage unit, configured to place each shard on its corresponding shard node for storage according to a preset algorithm.
In this technical solution, after the index is sharded, each shard has to be placed on a node of the server. For example, when an index in Hadoop is sharded, the shards of the index can be placed on several nodes of the Hadoop server according to an algorithm built into Hadoop, which completes the storage of the shards.
In any of the above technical solutions, preferably, the index sharding unit is further configured to divide a shard to be expanded, among the plurality of shards, into a plurality of sub-shards according to a received shard expansion instruction, and to record the shard information of each of the sub-shards.
In this technical solution, the number of shards of an index can be changed as required. The shard count of the index can be adjusted and the index sharded afresh, or one or more shards can be divided further into a plurality of sub-shards, so that the user can search shards that correspond to either a larger or a smaller amount of data.
The application also proposes a distributed search method comprising: step 202, creating an index for specified data according to a received index creation instruction; step 204, dividing the index into a plurality of shards according to received shard configuration data, and recording the shard information of each of the plurality of shards; and step 206, determining at least one piece of shard information according to a received search condition, looking up at least one target shard among the plurality of shards according to the at least one piece of shard information, and returning the data corresponding to each target shard to the user.
In this technical solution, once an index has been created for some data, the index can be divided into a plurality of shards, and a search can state directly in its search condition the shard information to be searched. Because one index corresponds to several shards, each shard corresponds to a smaller amount of data. Searching a shard therefore finds its small block of data faster than searching the large block of data behind the whole index, and returning the small blocks for the selected shards to the user is likewise faster than returning the large block. This speeds up both querying and returning results, and improves the real-time performance of data search.
In the above technical solution, preferably, step 202 further comprises generating an index database from the metadata information produced while the index is created; step 204 then comprises recording the shard information of each shard in the index database; and step 206 comprises determining the at least one piece of shard information in the index database according to the search condition.
In this technical solution, each index creation process generates corresponding metadata information that records the details of the index, such as the data the index corresponds to, the location of the index and the ID of the index. An index database can be built from this metadata, so that when the user enters a search condition the corresponding index can be found in the index database quickly and conveniently. After the index is sharded, the shard information of each shard can also be stored in the index database, so that the matching shard information can be determined quickly and accurately from the search condition and the corresponding shard then located.
In the above technical solution, preferably, the shard configuration data comprises a shard count and/or shard nodes; the shard information comprises a shard identifier and/or the shard node.
In this technical solution, the user can set the shard configuration data as required, thereby specifying how many shards the index is divided into and which node each shard is distributed to (that is, where the shard resides in the server). The shard information can comprise a shard identifier and/or a shard node, so the user can find a shard by entering, in the search instruction, the identifier and/or the location of the shard to be searched.
In the above technical solution, preferably, step 204 further comprises placing each shard on its corresponding shard node for storage according to a preset algorithm.
In this technical solution, after the index is sharded, each shard has to be placed on a node of the server. For example, when an index in Hadoop is sharded, the shards of the index can be placed on several nodes of the Hadoop server according to an algorithm built into Hadoop, which completes the storage of the shards.
In the above technical solution, preferably, the method further comprises dividing a shard to be expanded, among the plurality of shards, into a plurality of sub-shards according to a received shard expansion instruction, and recording the shard information of each of the sub-shards.
In this technical solution, the number of shards of an index can be changed as required. The shard count of the index can be adjusted and the index sharded afresh, or one or more shards can be divided further into a plurality of sub-shards, so that the user can search shards that correspond to either a larger or a smaller amount of data.
With the above technical solutions, an index can be sharded after it has been created for the data, and a search operation can then look up the one or more shards that match the search condition. Because each shard corresponds to less data than the whole index, searching shards improves search response speed.
Description of drawings
Fig. 1 shows a block diagram of a distributed search system according to an embodiment of the invention;
Fig. 2 shows a flowchart of a distributed search method according to an embodiment of the invention;
Fig. 3 shows a schematic diagram of creating shards and searching shards according to an embodiment of the invention;
Fig. 4A to Fig. 4C show schematic diagrams of expanding shards according to an embodiment of the invention.
Detailed description of the embodiments
So that the above objects, features and advantages of the invention can be understood more clearly, the invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, provided they do not conflict, the embodiments of this application and the features of those embodiments can be combined with one another.
Many specific details are set out in the following description to give a full understanding of the invention. The invention can, however, also be implemented in ways other than those described here, so the scope of protection of the invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows a block diagram of a distributed search system according to an embodiment of the invention.
As shown in Fig. 1, a distributed search system 100 according to an embodiment of the invention comprises: an index creation unit 102, configured to create an index for specified data according to a received index creation instruction; an index sharding unit 104, configured to divide the index into a plurality of shards according to received shard configuration data, and to record the shard information of each of the plurality of shards; and a shard search unit 106, configured to determine at least one piece of shard information according to a received search condition, to look up at least one target shard among the plurality of shards according to the at least one piece of shard information, and to return the data corresponding to each target shard to the user.
Once an index has been created for some data, the index can be divided into a plurality of shards, and a search can state directly in its search condition the shard information to be searched. Because one index corresponds to several shards, each shard corresponds to a smaller amount of data. Searching a shard therefore finds its small block of data faster than searching the large block of data behind the whole index, and returning the small blocks for the selected shards to the user is likewise faster than returning the large block. This speeds up both querying and returning results, and improves the real-time performance of data search.
For example, if an index is created for data that is 10 GB in size, the index corresponds to 10 GB of data. If the index is divided evenly into 5 shards, each shard corresponds to 2 GB of data. The user can search one or several of the 5 shards as required; searching one or a few 2 GB shards responds faster than searching the full 10 GB, and the search precision is also better matched to the user's needs.
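The arithmetic of this example can be sketched in a few lines of Python. The function names `shard_size_gb` and `scanned_gb` are hypothetical and do not come from the patent; the sketch only assumes, as the example does, that the index is divided evenly.

```python
def shard_size_gb(total_gb: float, shard_count: int) -> float:
    """Size of each shard when the index is divided evenly."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return total_gb / shard_count


def scanned_gb(total_gb: float, shard_count: int, selected: int) -> float:
    """Amount of data touched when only `selected` shards are searched."""
    if not 0 <= selected <= shard_count:
        raise ValueError("selected must be between 0 and shard_count")
    return shard_size_gb(total_gb, shard_count) * selected


per_shard = shard_size_gb(10, 5)    # 2.0 GB per shard
one_shard = scanned_gb(10, 5, 1)    # 2.0 GB instead of 10 GB
all_shards = scanned_gb(10, 5, 5)   # 10.0 GB, the same as the whole index
```

The last line shows the boundary case: the saving comes from selecting fewer shards than the index contains, so a search that names all 5 shards touches as much data as a search of the whole index.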
Preferably, the index creation unit 102 is further configured to generate an index database from the metadata information produced while the index is created; the index sharding unit 104 is configured to record the shard information of each shard in the index database; and the shard search unit 106 is configured to determine the at least one piece of shard information in the index database according to the search condition.
Each index creation process generates corresponding metadata information that records the details of the index, such as the data the index corresponds to, the location of the index and the ID of the index. An index database can be built from this metadata, so that when the user enters a search condition the corresponding index can be found in the index database quickly and conveniently. After the index is sharded, the shard information of each shard can also be stored in the index database, so that the matching shard information can be determined quickly and accurately from the search condition and the corresponding shard then located.
Continuing the example above, the index database for the 10 GB of data contains the shard information of each of the 5 shards. For instance, the shard information of the first shard comprises ID 10.1 and a location of node 16 on server 2; that of the second shard comprises ID 10.2 and a location of node 17 on server 2; and so on. The index database thus records the shard information of all 5 shards of the index. The user can enter the IDs of the shards to be queried directly: if the search condition contains 10.1 and 10.2, the data corresponding to the first and second shards is looked up.
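A minimal sketch of such a lookup, assuming the index database can be modelled as an in-memory dictionary keyed by shard ID. The `ShardInfo` class and the `resolve` function are hypothetical names introduced for illustration; only the two entries spelled out in the example are filled in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardInfo:
    shard_id: str
    server: int
    node: int


# Index database for the 10 GB example; the remaining three shards are
# left out because the text does not give their locations.
index_database = {
    "10.1": ShardInfo("10.1", server=2, node=16),
    "10.2": ShardInfo("10.2", server=2, node=17),
}


def resolve(search_condition):
    """Return the shard information for every shard ID named in the condition."""
    return [index_database[sid] for sid in search_condition if sid in index_database]


targets = resolve(["10.1", "10.2"])
```

Here `targets` holds the locations of the first and second shards, which is the information a search unit would need before fetching the data behind each shard. Unknown IDs are skipped silently in this sketch; a real system might report them instead.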
Preferably, the shard configuration data comprises a shard count and/or shard nodes; the shard information comprises a shard identifier and/or a shard node.
The user can set the shard configuration data as required, thereby specifying how many shards the index is divided into and which node each shard is distributed to (that is, where the shard resides in the server). The shard information can comprise a shard identifier and/or a shard node, so the user can find a shard by entering, in the search instruction, the identifier and/or the location of the shard to be searched.
Preferably, the system further comprises a shard storage unit 108, configured to place each shard on its corresponding shard node for storage according to a preset algorithm.
After the index is sharded, each shard has to be placed on a node of the server. For example, when an index in Hadoop is sharded, the shards of the index can be placed on several nodes of the Hadoop server according to an algorithm built into Hadoop, which completes the storage of the shards.
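The text does not say which preset algorithm is used. As an illustration only, the sketch below assumes a simple round-robin placement; this is a stand-in chosen for clarity, not Hadoop's own placement policy, and the function name and node names are hypothetical.

```python
def place_round_robin(shard_ids, nodes):
    """Assign each shard to a node in turn, wrapping around when nodes run out.

    Round-robin is an assumption made for this sketch; the patent only
    requires that some preset algorithm maps shards to shard nodes.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    return {sid: nodes[i % len(nodes)] for i, sid in enumerate(shard_ids)}


placement = place_round_robin(["10.1", "10.2", "10.3"], ["node-a", "node-b"])
```

With three shards and two nodes, the third shard wraps back to the first node. The resulting mapping is the kind of shard-node information that would be recorded alongside each shard identifier.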
Preferably, the index sharding unit 104 is further configured to divide a shard to be expanded, among the plurality of shards, into a plurality of sub-shards according to a received shard expansion instruction, and to record the shard information of each of the sub-shards.
The number of shards of an index can be changed as required. The shard count of the index can be adjusted and the index sharded afresh, or one or more shards can be divided further into a plurality of sub-shards, so that the user can search shards that correspond to either a larger or a smaller amount of data.
Continuing the example above, each of the 5 shards can be divided further. For instance, dividing the first shard into 3 sub-shards can give the sub-shards the IDs 10.1.1, 10.1.2 and 10.1.3, so that the user can enter sub-shard information in the search instruction to search a sub-shard, further improving search precision. The index for the 10 GB of data can of course also be re-divided: the existing shards of the index are deleted first, and the index is then divided afresh according to the shard count in the received shard configuration data (which can be a configuration file set by the user), for example into 3 shards corresponding to 3 GB, 3 GB and 4 GB of data respectively, so that the user can search shards that cover more data.
Fig. 2 shows a flowchart of a distributed search method according to an embodiment of the invention.
As shown in Fig. 2, a distributed search method according to an embodiment of the invention comprises: step 202, creating an index for specified data according to a received index creation instruction; step 204, dividing the index into a plurality of shards according to received shard configuration data, and recording the shard information of each of the plurality of shards; and step 206, determining at least one piece of shard information according to a received search condition, looking up at least one target shard among the plurality of shards according to the at least one piece of shard information, and returning the data corresponding to each target shard to the user.
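The three steps can be sketched end to end as follows. This is a toy, single-process illustration under stated assumptions: the index is modelled as a sorted list of record keys, shards are contiguous slices of it, and the function names are hypothetical rather than taken from the patent.

```python
import math


def create_index(records):
    """Step 202: build an index; here simply the sorted record keys."""
    return sorted(records)


def shard_index(index_id, entries, shard_count):
    """Step 204: cut the index into contiguous shards, recorded under their IDs."""
    size = math.ceil(len(entries) / shard_count)
    return {
        f"{index_id}.{i + 1}": entries[i * size:(i + 1) * size]
        for i in range(shard_count)
    }


def search_shards(shards, search_condition):
    """Step 206: return the data of every shard ID named in the search condition."""
    return {sid: shards[sid] for sid in search_condition if sid in shards}


index = create_index(["f", "b", "d", "a", "e", "c"])
shards = shard_index("A", index, 3)
result = search_shards(shards, ["A.1", "A.3"])
```

Only the two shards named in the search condition are read; shard A.2 is never touched, which is the source of the speed-up the method relies on.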
Once an index has been created for some data, the index can be divided into a plurality of shards, and a search can state directly in its search condition the shard information to be searched. Because one index corresponds to several shards, each shard corresponds to a smaller amount of data. Searching a shard therefore finds its small block of data faster than searching the large block of data behind the whole index, and returning the small blocks for the selected shards to the user is likewise faster than returning the large block. This speeds up both querying and returning results, and improves the real-time performance of data search.
For example, if an index is created for data that is 10 GB in size, the index corresponds to 10 GB of data. If the index is divided evenly into 5 shards, each shard corresponds to 2 GB of data. The user can search one or several of the 5 shards as required; searching one or a few 2 GB shards responds faster than searching the full 10 GB, and the search precision is also better matched to the user's needs.
Preferably, step 202 further comprises generating an index database from the metadata information produced while the index is created; step 204 then comprises recording the shard information of each shard in the index database; and step 206 comprises determining the at least one piece of shard information in the index database according to the search condition.
Each index creation process generates corresponding metadata information that records the details of the index, such as the data the index corresponds to, the location of the index and the ID of the index. An index database can be built from this metadata, so that when the user enters a search condition the corresponding index can be found in the index database quickly and conveniently. After the index is sharded, the shard information of each shard can also be stored in the index database, so that the matching shard information can be determined quickly and accurately from the search condition and the corresponding shard then located.
Continuing the example above, the index database for the 10 GB of data contains the shard information of each of the 5 shards. For instance, the shard information of the first shard comprises ID 10.1 and a location of node 16 on server 2; that of the second shard comprises ID 10.2 and a location of node 17 on server 2; and so on. The index database thus records the shard information of all 5 shards of the index. The user can enter the IDs of the shards to be queried directly: if the search condition contains 10.1 and 10.2, the data corresponding to the first and second shards is looked up.
Preferably, the shard configuration data comprises a shard count and/or shard nodes; the shard information comprises a shard identifier and/or a shard node.
The user can set the shard configuration data as required, thereby specifying how many shards the index is divided into and which node each shard is distributed to (that is, where the shard resides in the server). The shard information can comprise a shard identifier and/or a shard node, so the user can find a shard by entering, in the search instruction, the identifier and/or the location of the shard to be searched.
Preferably, step 204 further comprises placing each shard on its corresponding shard node for storage according to a preset algorithm.
After the index is sharded, each shard has to be placed on a node of the server. For example, when an index in Hadoop is sharded, the shards of the index can be placed on several nodes of the Hadoop server according to an algorithm built into Hadoop, which completes the storage of the shards.
Preferably, the method further comprises dividing a shard to be expanded, among the plurality of shards, into a plurality of sub-shards according to a received shard expansion instruction, and recording the shard information of each of the sub-shards.
The number of shards of an index can be changed as required. The shard count of the index can be adjusted and the index sharded afresh, or one or more shards can be divided further into a plurality of sub-shards, so that the user can search shards that correspond to either a larger or a smaller amount of data.
Continuing the example above, each of the 5 shards can be divided further. For instance, dividing the first shard into 3 sub-shards can give the sub-shards the IDs 10.1.1, 10.1.2 and 10.1.3, so that the user can enter sub-shard information in the search instruction to search a sub-shard, further improving search precision. The index for the 10 GB of data can of course also be re-divided: the existing shards of the index are deleted first, and the index is then divided afresh according to the shard count in the received shard configuration data (which can be a configuration file set by the user), for example into 3 shards corresponding to 3 GB, 3 GB and 4 GB of data respectively, so that the user can search shards that cover more data.
Fig. 3 shows a schematic diagram of creating shards and searching shards according to an embodiment of the invention.
As shown in Fig. 3, the distributed search method of Fig. 2 can be applied in Hadoop. Hadoop can comprise an index server 304 and a search server 306. The master node 302 receives user instructions through a front-end interface and returns search content to the front end; it also controls the index creation operations of index server 304 through an index management interface, and controls the search operations of search server 306 through a search management interface. The namenode provides the metadata service and sends control commands to the datanodes on the nodes of index server 304 and of search server 306.
An index is first created for the target data and stored in index server 304 of Hadoop. According to the shard configuration data entered by the user, the index is divided into a plurality of shards, which are placed on several nodes of index server 304 for storage; for example, N shards are saved to index node 1 through index node N respectively. Index nodes 1 to N have corresponding nodes in search server 306, namely node (N+1) to node (N+N), and the shard information of each shard is likewise associated with node (N+1) to node (N+N). When the user enters a search request, search server 306 processes it and determines the shard information contained in the search request (that is, the search condition). For example, from the shard node given in the search request it determines the node on which the shard to be searched resides, then finds the information containing that shard node on node (N+1) to node (N+N), and thereby locates the corresponding shard, and the data corresponding to the shard, on the corresponding node of index server 304.
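The pairing between index nodes and search nodes can be sketched as a pair of small functions. The text only states that index nodes 1 to N correspond to nodes (N+1) to (N+N); that index node i pairs with node N+i, in order, is an assumption of this sketch, and the function names are hypothetical.

```python
def search_node_for(index_node: int, n: int) -> int:
    """Search-server node paired with a given index node (assumed i -> N + i)."""
    if not 1 <= index_node <= n:
        raise ValueError("index_node must be between 1 and N")
    return n + index_node


def index_node_for(search_node: int, n: int) -> int:
    """Index-server node holding the shard described on a given search node."""
    if not n + 1 <= search_node <= 2 * n:
        raise ValueError("search_node must be between N+1 and N+N")
    return search_node - n


n = 4
paired = search_node_for(1, n)   # index node 1 is described on node 5
holder = index_node_for(8, n)    # node 8 points back to index node 4
```

Under this assumption a search request that names a shard node can be routed in two hops: the search server finds the shard information on the paired node, then reads the shard from the index node it points back to.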
The data found can be returned to the user through the front-end interface. The slave nodes can store the shard information held in index server 304 and/or search server 306, and can also be used to store shards.
Fig. 4A to Fig. 4C show schematic diagrams of expanding shards according to an embodiment of the invention.
As shown in Fig. 4A, data A is 10 GB in size. An index with index ID A is created for data A and, according to the user's settings, divided evenly into 5 shards, so each shard corresponds to 2 GB of data. The IDs of the 5 shards are A.1, A.2, A.3, A.4 and A.5. The user can enter the IDs of the shards to be searched directly to look them up: for example, searching shard IDs A.1, A.3 and A.4 returns the data corresponding to those three shards to the user as the search result.
As shown in Fig. 4B, one or more of the 5 existing shards can be divided further according to the user's expansion instruction. For example, shard A.1 is divided into 4 sub-shards whose IDs are A.1.1, A.1.2, A.1.3 and A.1.4 in turn. This is equivalent to dividing index A into 8 shards, corresponding to 0.5 GB, 0.5 GB, 0.5 GB, 0.5 GB, 2 GB, 2 GB, 2 GB and 2 GB of data, so the user can enter the ID of a sub-shard to search the data corresponding to that sub-shard.
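The expansion of Fig. 4B can be reproduced with a short sketch that maps shard IDs to sizes in GB. The `expand` function is a hypothetical name, and splitting the target shard evenly is an assumption consistent with the 0.5 GB figures above.

```python
def expand(shards, target, parts):
    """Replace `target` with `parts` equally sized sub-shards, keeping order."""
    if target not in shards:
        raise KeyError(target)
    expanded = {}
    for shard_id, size_gb in shards.items():
        if shard_id == target:
            for k in range(1, parts + 1):
                expanded[f"{shard_id}.{k}"] = size_gb / parts
        else:
            expanded[shard_id] = size_gb
    return expanded


# Fig. 4A: index A divided evenly into five 2 GB shards.
fig_4a = {f"A.{i}": 2.0 for i in range(1, 6)}
# Fig. 4B: shard A.1 expanded into four sub-shards.
fig_4b = expand(fig_4a, "A.1", 4)
```

The result has 8 entries, A.1.1 to A.1.4 at 0.5 GB each followed by A.2 to A.5 at 2 GB each, and the total remains 10 GB.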
As shown in Fig. 4C, the user can also shard index A afresh as required. The shard information already present in the server is deleted first, and index A is then divided into 3 shards according to the newly entered shard configuration data. The shards have the IDs A.11, A.12 and A.13 and correspond to 3 GB, 3 GB and 4 GB of data respectively, so the user can search data of the required size.
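The text does not explain how the uneven 3 GB, 3 GB and 4 GB sizes are chosen. One simple rule that happens to reproduce them, assumed here purely for illustration, is to give every shard the whole-number quotient and add the remainder to the last shard. The function name is hypothetical, and the shard IDs are copied from Fig. 4C rather than generated.

```python
def reshard_sizes(total_gb: int, shard_count: int) -> list:
    """Whole-GB shard sizes, with any remainder added to the last shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    base, remainder = divmod(total_gb, shard_count)
    return [base] * (shard_count - 1) + [base + remainder]


# Fig. 4C: the old shard information is discarded and index A is re-divided.
new_ids = ["A.11", "A.12", "A.13"]
fig_4c = dict(zip(new_ids, reshard_sizes(10, len(new_ids))))
```

Because the previous shard information is deleted before re-division, `fig_4c` replaces the earlier mapping instead of being merged into it.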
The technical solution of the invention has been described above with reference to the drawings. In the related art, when a mass data search is run against a Hadoop database, the search is slow and the results are displayed with a considerable delay after the search is entered, which makes it difficult to meet users' demand for real-time search. The technical solution of the invention improves the response speed, and hence the real-time performance, of mass data search.
In the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance. The term "a plurality of" means two or more, unless expressly limited otherwise.
The above are only preferred embodiments of the invention and are not intended to limit the invention; for those skilled in the art, the invention can have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A distributed search system, characterized by comprising:
an index creation unit, configured to create an index for specified data according to a received index creation instruction;
an index fragmenting unit, configured to divide the index into a plurality of fragments according to received fragment configuration data, and to record the fragment information of each of the plurality of fragments;
a fragment search unit, configured to determine at least one piece of fragment information according to a received search condition, to search for at least one target fragment among the plurality of fragments according to the at least one piece of fragment information, and to return the data corresponding to each of the at least one target fragment to the user.
2. The distributed search system according to claim 1, characterized in that the index creation unit is further configured to generate an index database according to metadata information from the creation process of the index; the index fragmenting unit is configured to record the fragment information of each fragment in the index database; and the fragment search unit is configured to determine the at least one piece of fragment information in the index database according to the search condition.
3. The distributed search system according to claim 1, characterized in that the fragment configuration data comprises a fragment quantity and/or a fragment node, and the fragment information comprises a fragment identifier and/or the fragment node.
4. The distributed search system according to claim 3, characterized by further comprising:
a fragment storage unit, configured to place each fragment on its corresponding fragment node for storage according to a preset algorithm.
5. The distributed search system according to any one of claims 1 to 4, characterized in that the index fragmenting unit is further configured to divide a fragment to be expanded among the plurality of fragments into a plurality of sub-fragments according to a received fragment expansion instruction, and to record the fragment information of each of the plurality of sub-fragments.
6. A distributed search method, characterized by comprising:
step 202, creating an index for specified data according to a received index creation instruction;
step 204, dividing the index into a plurality of fragments according to received fragment configuration data, and recording the fragment information of each of the plurality of fragments;
step 206, determining at least one piece of fragment information according to a received search condition, searching for at least one target fragment among the plurality of fragments according to the at least one piece of fragment information, and returning the data corresponding to each of the at least one target fragment to the user.
7. The distributed search method according to claim 6, characterized in that step 202 further comprises: generating an index database according to metadata information from the creation process of the index; step 204 then comprises: recording the fragment information of each fragment in the index database; and step 206 comprises: determining the at least one piece of fragment information in the index database according to the search condition.
8. The distributed search method according to claim 6, characterized in that the fragment configuration data comprises a fragment quantity and/or a fragment node, and the fragment information comprises a fragment identifier and/or the fragment node.
9. The distributed search method according to claim 8, characterized in that step 204 further comprises: placing each fragment on its corresponding fragment node for storage according to a preset algorithm.
10. The distributed search method according to any one of claims 6 to 9, characterized by further comprising: dividing a fragment to be expanded among the plurality of fragments into a plurality of sub-fragments according to a received fragment expansion instruction, and recording the fragment information of each of the plurality of sub-fragments.
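
By way of illustration only, and without limiting the claims, steps 202 to 206 together with the index database and fragment nodes mentioned above can be sketched in Python. All names (`create_index`, `fragment_index`, `locate_fragments`, the node labels) are hypothetical, the index database is a plain dictionary, and round-robin placement is merely assumed as one possible "preset algorithm"; the claims do not specify one.

```python
from itertools import cycle


def create_index(index_db: dict, index_id: str, source: str) -> None:
    """Step 202: register an index and its creation metadata."""
    index_db[index_id] = {"source": source, "fragments": {}}


def fragment_index(index_db: dict, index_id: str, count: int, nodes: list[str]) -> None:
    """Step 204: split the index into `count` fragments and record, for each
    fragment ID, the node storing it (round-robin placement is an assumption)."""
    if not nodes:
        raise ValueError("at least one fragment node is required")
    node_iter = cycle(nodes)
    index_db[index_id]["fragments"] = {
        f"{index_id}.{n}": next(node_iter) for n in range(1, count + 1)
    }


def locate_fragments(index_db: dict, index_id: str, fragment_ids: list[str]) -> dict[str, str]:
    """Step 206 (lookup part): map each requested fragment ID to its node."""
    recorded = index_db[index_id]["fragments"]
    return {fid: recorded[fid] for fid in fragment_ids if fid in recorded}


index_db: dict = {}
create_index(index_db, "A", "data A")
fragment_index(index_db, "A", 5, ["node-1", "node-2"])
targets = locate_fragments(index_db, "A", ["A.1", "A.3", "A.4"])
print(targets)
```

The sketch stops at locating the target fragments; fetching the corresponding data from each node and returning it to the user is omitted.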
CN2013102818388A 2013-07-05 2013-07-05 Distributed searching system and method Pending CN103310023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102818388A CN103310023A (en) 2013-07-05 2013-07-05 Distributed searching system and method

Publications (1)

Publication Number Publication Date
CN103310023A true CN103310023A (en) 2013-09-18

Family

ID=49135241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102818388A Pending CN103310023A (en) 2013-07-05 2013-07-05 Distributed searching system and method

Country Status (1)

Country Link
CN (1) CN103310023A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102169507A (en) * 2011-05-26 2011-08-31 厦门雅迅网络股份有限公司 Distributed real-time search engine
CN102722531A (en) * 2012-05-17 2012-10-10 北京大学 Query method based on regional bitmap indexes in cloud environment
CN102779185A (en) * 2012-06-29 2012-11-14 浙江大学 High-availability distribution type full-text index method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130918
