Method for switching the index of a search system, and search system
Technical field
The present invention relates to web search service technology, and in particular to a method for switching the index of a search system and to a search system.
Background art
Web search service technology is now used in many areas, for example in commodity text retrieval systems for e-commerce. Fig. 1 is a structural diagram of an existing search system. Referring to Fig. 1, the search system comprises an index server, a search server and a transfer server.
The index server contains the index information generation unit of the search engine, which generates the index information used to perform search service, namely the index, the in-memory index table and the attribute filter table, and provides these data to the search server. Here, the index is all the data used for document retrieval that is provided for a specific application, where a document is a piece of raw data that the index server indexes. The attribute filter table is a table ordered by document identifier (ID), each record of which holds the filterable attributes of the commodity corresponding to that document ID; filterable attributes may be, for example, the price or the production date of the commodity. The in-memory index table is a mapping table from the unique ID of a commodity to the current document ID of that commodity.
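The two tables defined above can be pictured with a minimal sketch. The field names, sample values and helper function are hypothetical illustrations and do not come from the search system described here; the sketch only assumes that document IDs are consecutive integers starting at 0, so that the position of a record in the attribute filter table is its document ID.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FilterRecord:
    """Filterable attributes of one commodity (hypothetical fields)."""
    price: float
    production_date: date


# Attribute filter table: ordered by document ID, so record i belongs to doc ID i.
attribute_filter_table = [
    FilterRecord(price=19.9, production_date=date(2007, 1, 5)),     # doc ID 0
    FilterRecord(price=250.0, production_date=date(2006, 11, 20)),  # doc ID 1
    FilterRecord(price=75.5, production_date=date(2007, 3, 1)),     # doc ID 2
]

# In-memory index table: unique commodity ID -> current document ID.
memory_index_table = {"item-1001": 0, "item-1002": 1, "item-1003": 2}


def filter_by_max_price(hit_doc_ids, max_price):
    """Keep only the full-text hits whose price does not exceed max_price."""
    return [d for d in hit_doc_ids if attribute_filter_table[d].price <= max_price]


cheap_hits = filter_by_max_price([0, 1, 2], 100.0)
```

Because the attribute filter table is addressed by document ID while the in-memory index table is keyed by commodity ID, both tables must describe the same generation of the index, which is why they are switched together with it.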
The search server contains the search engine, which provides full-text search, result filtering, immediate deletion of records and index switching. It mainly performs search service according to the index information generated by and transferred from the index server.
The transfer server mainly provides the index server with the application-related data source and forwards the search request commands of the front end to the search server.
Because the data sources on the network are constantly updated, the index server must generate new index information from the data source in a timely manner and notify the search server to perform an index switch. An index switch means that the search server obtains the latest index information (the index, the in-memory index table and the attribute filter table) from the index server, replaces its own existing index information with it, and performs search service with the updated index information.
In the existing index switching technique of the search server, the index data, the attribute filter table and the in-memory index table must be switched at the same time. To keep the index of the search server consistent with the data in the attribute filter table and the in-memory index table during the switch, the index information must be protected by a lock while the index, the attribute filter table and the in-memory index table are being updated, so that the search server cannot retrieve data that is inconsistent with the current attribute filter table and in-memory index table. While the lock is held, however, the search server cannot access the index information and therefore cannot perform search service during the index switch. When the index information is very large, the switch takes a long time, which seriously disrupts the work of the search server: it cannot perform search service for a long period, and the quality of the search service is reduced.
Summary of the invention
In view of this, the main object of the present invention is to provide a method for switching the index of a search system, so as to reduce the impact of index switching on search service and improve the quality of the search service.
Another object of the present invention is to provide a search system, so as to reduce the impact of index switching on search service and improve the quality of the search service.
To achieve the above objects, the main technical solutions of the present invention are as follows:
A method for switching the index of a search system, in which a main directory and a standby directory for storing index information, and a main memory space and a spare memory space for loading index information, are preset in the search server, and the search server performs search operations according to the index information in the main directory and the main memory space. When an index switch is performed, the new index information generated by the index server is copied into the standby directory and loaded into the spare memory space; the main directory and the standby directory are then switched with each other, and the main memory space and the spare memory space are switched with each other.
Preferably, the main directory and the standby directory are switched with each other by exchanging the name of the main directory with the name of the standby directory.
Preferably, the index information comprises the index, the attribute filter table and the in-memory index table.
Preferably, the method further comprises setting a suitable index switching time according to the size of the index information.
Preferably, the index information loaded into memory is the attribute filter table and the in-memory index table, and the main memory space and the spare memory space are switched with each other by exchanging the memory start addresses of the attribute filter tables in the main and spare memory spaces, and exchanging the memory start addresses of the in-memory index tables in the main and spare memory spaces.
Preferably, the operations of switching the main directory with the standby directory and the main memory space with the spare memory space are protected, together with the search operation, by a mutex lock.
Preferably, the search system is a commodity text retrieval system for e-commerce.
Preferably, the index server generates the new index information as follows:
(1) generating inverted index data for each document;
(2) merging the in-memory indexes: a batch of documents in memory is merged into one larger data segment, and the segment is stored on disk;
(3) optimizing the index, regenerating a new optimized index;
(5) generating the attribute filter table of the commodities;
(6) generating the in-memory index table of the commodities.
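Steps (1) to (3) above can be illustrated with a minimal sketch. It assumes whitespace tokenization, keeps everything in memory rather than writing segments to disk, and records only keyword positions; all function names and sample documents are hypothetical and are not taken from the index server described here.

```python
from collections import defaultdict


def invert_document(doc_id, text):
    """Step (1): map each keyword to the positions where it occurs in one document."""
    postings = defaultdict(list)
    for position, word in enumerate(text.lower().split()):
        postings[word].append(position)
    return {word: {doc_id: positions} for word, positions in postings.items()}


def merge_segments(segments):
    """Step (2): merge several per-document indexes into one larger segment."""
    merged = defaultdict(dict)
    for segment in segments:
        for word, docs in segment.items():
            merged[word].update(docs)
    return dict(merged)


def optimize(segment, deleted_doc_ids):
    """Step (3): rebuild the index without the documents marked as deleted."""
    optimized = {}
    for word, docs in segment.items():
        kept = {d: p for d, p in docs.items() if d not in deleted_doc_ids}
        if kept:
            optimized[word] = kept
    return optimized


docs = {0: "red cotton shirt", 1: "blue cotton jeans", 2: "red wool scarf"}
segment = merge_segments(invert_document(i, t) for i, t in docs.items())
index = optimize(segment, deleted_doc_ids={1})
```

In this sketch the optimized index no longer contains any posting for document 1, and keywords that occurred only in that document disappear entirely, which is one way an optimized index can end up smaller than the segments it was built from.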
A search system comprising an index server, a transfer server and a search server, wherein the search server is provided with a main directory and a standby directory for storing index information, and with a main memory space and a spare memory space, and the search server performs search operations according to the index information in the main directory and the main memory space. The search server further comprises an index switch unit for copying the newly generated index information from the index server to the standby directory and loading it into the spare memory space, and then switching the main directory with the standby directory while switching the main memory space with the spare memory space.
Preferably, the index switch unit further comprises a mutex lock unit for applying a mutex lock to the switching operations and to the search operation of the search engine when the main directory is switched with the standby directory and the main memory space is switched with the spare memory space.
Preferably, the index server further comprises functional units for record updating, index data optimization, generation of the attribute filter table and the in-memory index table, and data copying.
The present invention places the current index data, attribute filter table and in-memory index table under the main directory and loads the current attribute filter table and in-memory index table into the main memory space. The new index data, attribute filter table and in-memory index table brought in by a switch are placed under the standby directory, and the new attribute filter table and in-memory index table are loaded into the spare memory space. The search system performs search service according to the data in the main directory and the main memory space. When a switch is needed, the directories are switched simply by exchanging their names, and once all the preparatory work for the switch has been completed, the system only has to exchange the start addresses of the attribute filter tables and of the in-memory index tables. Lock protection is needed only while the two memory start addresses are exchanged, and this lock is held for a very short time that has essentially no effect on search service. The present invention can therefore greatly reduce the impact of index switching on the search service system.
In addition, the search system of the present invention can set a suitable index switching time according to the size of the index data, which ensures that the index information is updated in time without affecting normal service. Moreover, the optimized index can improve the performance of the search service to some extent and save disk space on the search server.
Brief description of the drawings
Fig. 1 is a structural diagram of an existing search system;
Fig. 2 is a flowchart of an embodiment of the method of the present invention;
Fig. 3 is a structural diagram of an embodiment of the search system of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to specific embodiments and the drawings.
The description takes as an example a search system used in a commodity text retrieval system for e-commerce.
In this embodiment, two directories are created on the search server: the index directory and the switch directory. The index directory is the main directory and the switch directory is the standby directory. Two memory spaces are also set up: the main memory space and the spare memory space. The index information currently used by the search engine (the index, the attribute filter table and the in-memory index table) is placed under the index directory, and the attribute filter table and in-memory index table currently in use are loaded into the main memory space. The new index, attribute filter table and in-memory index table brought in by an index switch are placed under the switch directory, and the new attribute filter table and in-memory index table are loaded into the spare memory space. The search server always performs search operations according to the index information under the index directory and in the main memory space.
Fig. 2 is a flowchart of an embodiment of the method of the present invention. Referring to Fig. 2, the flow comprises:
Step 21: the index server generates new index information according to the data source.
This step specifically comprises:
(1) Generating inverted index data for each document. The inverted index can record ranking-related information about keywords, such as the weight of a keyword in a given document and the positions where it occurs; an inverted list is a mapping from a keyword to a list of documents.
(2) Merging the in-memory indexes: a batch of documents in memory is merged into one larger data segment, and the segment is stored on disk.
(3) Optimizing the index. After documents have been added to the index in the two steps above, a delete operation does not remove a document from the index file immediately; instead, the deleted document is recorded in a "deleted" file. When the index is optimized, the system regenerates a new, optimized index file (that is, the index).
(5) Generating the attribute filter table of the commodities, which holds the selective, non-indexed attribute information of all commodities; during full-text search these attributes are used to filter the commodity retrieval results.
(6) Generating the in-memory index table of the commodities, that is, a hash table that maps the external unique ID of a commodity to its current document ID in the search system. When the search server receives a record deletion command from the transfer server, the system looks up this table to locate the current document ID of the commodity record quickly.
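The role of the in-memory index table in record deletion can be sketched as follows. This is an illustration under stated assumptions, not the search server's actual code: the class and method names are hypothetical, a Python set stands in for the "deleted" file, and hiding deleted documents from search results is an assumed behavior of immediate record deletion.

```python
class DeletionTracker:
    """Hypothetical helper that resolves deletion commands to document IDs."""

    def __init__(self, memory_index_table):
        # Hash table: external unique commodity ID -> current document ID.
        self.memory_index_table = memory_index_table
        # Stands in for the "deleted" file; the index itself is left untouched.
        self.deleted_doc_ids = set()

    def delete_record(self, commodity_id):
        """Handle a deletion command; return False if the commodity is unknown."""
        doc_id = self.memory_index_table.pop(commodity_id, None)
        if doc_id is None:
            return False
        self.deleted_doc_ids.add(doc_id)
        return True

    def visible(self, hit_doc_ids):
        """Drop deleted documents from a list of full-text hits."""
        return [d for d in hit_doc_ids if d not in self.deleted_doc_ids]


tracker = DeletionTracker({"item-1001": 0, "item-1002": 1})
first_attempt = tracker.delete_record("item-1002")
second_attempt = tracker.delete_record("item-1002")
remaining_hits = tracker.visible([0, 1])
```

The hash lookup makes the deletion cost independent of the index size, and the deleted documents are physically removed only when the index is next optimized.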
Step 22: the index server sends an index switch command to the search server.
Step 23: on receiving the index switch command from the index server, the search server copies the optimized index information (the index, the attribute filter table and the in-memory index table) from the index server into the switch directory, and loads the newly generated attribute filter table and in-memory index table into the spare memory space.
Because step 23 takes a relatively long time, while the search server is executing step 23 it continues to perform search operations with the index information in the index directory and with the old attribute filter table and in-memory index table already loaded in the main memory space, so that users can still search the old index information normally.
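The preparation work of step 23 can be sketched as below. The sketch assumes that the new index information is reachable as a local directory and that the two tables are stored as JSON files; the file names and the function name are hypothetical, and a real system could use any transfer mechanism and storage format.

```python
import json
import shutil
from pathlib import Path


def prepare_standby(source_dir, standby_dir):
    """Copy new index information into the standby directory and load its tables."""
    standby = Path(standby_dir)
    if standby.exists():
        shutil.rmtree(standby)  # discard the previous generation
    shutil.copytree(source_dir, standby)
    # Only the two tables are loaded into memory; the index itself stays on disk.
    filter_table = json.loads((standby / "attribute_filter.json").read_text())
    index_table = json.loads((standby / "memory_index.json").read_text())
    return {"attribute_filter_table": filter_table, "memory_index_table": index_table}
```

Nothing in this function touches the main directory or the main memory space, so it can run for as long as it needs without holding any lock, while searches continue against the old data.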
Step 24: the search server switches the switch directory and the index directory with each other, specifically by exchanging the directory names of the switch directory and the index directory. It then switches the main memory space and the spare memory space, specifically by exchanging the memory start address of the new attribute filter table in the spare memory space with that of the old attribute filter table in the main memory space, and exchanging the memory start address of the new in-memory index table in the spare memory space with that of the old in-memory index table in the main memory space. After the main and standby directories and the main and spare memory spaces have been switched, the former standby directory and spare memory space become the current main directory and main memory space. Because the search server always performs search operations according to the current main directory and main memory space, search operations are from then on based on the new index information.
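Exchanging two directory names can be sketched with three renames through a temporary name. This is a minimal illustration that assumes both directories sit under the same parent on the same file system; the function name and the temporary suffix are hypothetical.

```python
import os


def swap_directory_names(main_dir, standby_dir):
    """Exchange the names of two sibling directories using three renames."""
    temp_dir = main_dir + ".swap_tmp"  # hypothetical temporary name
    os.rename(main_dir, temp_dir)      # index      -> index.swap_tmp
    os.rename(standby_dir, main_dir)   # switch     -> index
    os.rename(temp_dir, standby_dir)   # index.swap_tmp -> switch
```

Each rename moves no file data, so the exchange is fast regardless of the index size. The three renames are not atomic as a group, however, so a search that opened the main directory between them could find it missing; this is one reason to run the exchange under the same mutex lock as the search operation.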
To keep the current index consistent with the data of the current attribute filter table, the two switching operations above, on the main and standby directories and on the main and spare memory spaces, must be protected by a mutex lock shared with the search operation of the search server. Because the two operations in this step take very little CPU time, the mutex lock is held only briefly, so compared with the prior art the present invention has essentially no effect on search operations.
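The mutex-protected exchange of the two memory spaces can be sketched as follows. Python has no raw start addresses, so the sketch swaps object references instead, which plays the same role; the class, its methods and the table layout are hypothetical, and the directory name exchange would be performed inside the same locked section.

```python
import threading


class SearchServer:
    """Hypothetical search server holding a main and a spare set of tables."""

    def __init__(self, main_tables):
        self._lock = threading.Lock()
        self.main = main_tables  # tables used by every search
        self.spare = None        # tables being prepared for the next switch

    def load_spare(self, tables):
        """Slow preparation work: runs without the lock, searches are unaffected."""
        self.spare = tables

    def switch(self):
        """Fast part: exchange the two references while holding the mutex."""
        if self.spare is None:
            raise RuntimeError("no new index information has been loaded")
        with self._lock:
            self.main, self.spare = self.spare, self.main

    def search(self, commodity_id):
        """Look up one commodity; holds the same mutex as the switch."""
        with self._lock:
            doc_id = self.main["memory_index_table"].get(commodity_id)
            if doc_id is None:
                return None
            return self.main["attribute_filter_table"][doc_id]


server = SearchServer({"attribute_filter_table": [{"price": 10}],
                       "memory_index_table": {"item-1001": 0}})
before_load = server.search("item-1001")
server.load_spare({"attribute_filter_table": [{"price": 12}],
                   "memory_index_table": {"item-1001": 0}})
before_switch = server.search("item-1001")
server.switch()
after_switch = server.search("item-1001")
```

Only the reference exchange runs under the lock, so a search can be delayed for at most the duration of that exchange, and it always sees an attribute filter table and an in-memory index table from the same generation.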
In practice, because the index server continually generates new index information, the index switch of steps 21 to 24 is executed in a loop. The search system of the present invention can set a suitable index switching time according to the size of the index information, which ensures that the index information is updated in time without affecting normal service. In addition, the index is optimized in step 21, and the optimized index can improve the performance of the search service to some extent and save disk space on the search server.
Fig. 3 is a structural diagram of an embodiment of the search system of the present invention. Referring to Fig. 3, the search system comprises:
Index server 35, which has functional units for record updating, index data optimization, generation of the attribute filter table and the in-memory index table, and data copying. It includes index information generation unit 36, which generates new index information and notifies the search server to perform an index switch.
Transfer server 37, which mainly provides the index server with the application-related data source and forwards the search request commands of the front end to the search server. Transfer server 37 is identical to the transfer server of the prior art and is not described again here.
Search server 38, which mainly performs search service according to the index information generated by and transferred from the index server. It comprises:
Search engine 30, which provides full-text search, result filtering, immediate deletion of records and index switching.
Main directory 31 and standby directory 32 for storing index information, together with main memory space 33 and spare memory space 34. Search engine 30 performs search operations according to the index information in main directory 31 and main memory space 33. Main directory 31 and standby directory 32 store the index information, namely index 301, attribute filter table 302 and in-memory index table 303; main memory space 33 and spare memory space 34 are used to load attribute filter table 302 and in-memory index table 303.
Search engine 30 further comprises index switch unit 35, which copies the new index information from the index server to standby directory 32 and loads attribute filter table 302 and in-memory index table 303 into spare memory space 34. It then switches the main directory with the standby directory while switching the main memory space with the spare memory space; that is, it exchanges the directory names of the main directory and the standby directory, and exchanges the start addresses of the attribute filter tables and of the in-memory index tables in the main and spare memory spaces respectively.
The index switch unit further comprises mutex lock unit 304, which applies a mutex lock to the switching operations and to the search operation of search engine 30 when main directory 31 is switched with standby directory 32 and main memory space 33 is switched with spare memory space 34.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.