CN106407376B - Index reconstruction method and device - Google Patents

Index reconstruction method and device

Info

Publication number
CN106407376B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610817528.7A
Other languages
Chinese (zh)
Other versions
CN106407376A (en
Inventor
牟宣理
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201610817528.7A (CN106407376B)
Priority to CN201911129763.5A (CN110990399B)
Publication of CN106407376A
Application granted
Publication of CN106407376B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof

Abstract

The application provides a method and a device for reconstructing an index. The method comprises: receiving a reconstruction index request carrying a new index structure, determining a main fragment corresponding to the reconstruction index request, and creating a copy fragment for the main fragment; copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to the new index structure to obtain new index data; and deleting the main fragment and switching the copy fragment into the main fragment. Because the index is rebuilt by synchronizing the original data of the main fragment into the corresponding copy fragment, an external client does not need to read and write the original data through query requests and write requests, which reduces the consumption of network resources.

Description

Index reconstruction method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for reconstructing an index.
Background
ElasticSearch is a Lucene-based search server. The search server comprises a plurality of index libraries. A client writes original data and an index structure into an index library, and the search server then establishes an index for the original data according to the index structure to obtain index data. However, when the index structure changes, the old index data can no longer meet the user's requirements, and the search server needs to rebuild the index from the original data in the index library to obtain new index data.
In the prior art, the search server creates a new index library for the old index library. A client reads the original data from the old index library and writes the read original data, together with the new index structure, into the new index library. The search server then rebuilds the index on the original data according to the new index structure to obtain new index data, switches from the old index library to the new index library by modifying the index alias, and deletes the old index library. However, the original data must be transmitted over the network while the client reads and writes it, so network resource consumption is high and index reconstruction efficiency is low.
Disclosure of Invention
In view of the above, the present application provides a method and an apparatus for reconstructing an index, so as to solve the problem of low index reconstruction efficiency in the conventional reconstruction method.
According to a first aspect of embodiments of the present application, there is provided a method for reconstructing an index, the method including:
receiving a reconstruction index request carrying a new index structure, determining a main fragment corresponding to the reconstruction index request, and creating a copy fragment for the main fragment;
copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to the new index structure to obtain new index data;
and deleting the main fragment and switching the copy fragment into the main fragment.
According to a second aspect of embodiments of the present application, there is provided a reconstruction index apparatus, the apparatus including:
a receiving unit, configured to receive a reconstruction index request carrying a new index structure;
a creating unit, configured to determine a main fragment corresponding to the reconstruction index request, and create a copy fragment for the main fragment;
a synchronization unit, configured to copy the original data recorded in the main fragment to the copy fragment;
an index establishing unit, configured to establish an index for the original data according to the new index structure to obtain new index data;
and a switching unit, configured to delete the main fragment and switch the copy fragment into the main fragment.
By applying the embodiment of the application, when receiving a reconstruction index request carrying a new index structure, the search server determines a main fragment corresponding to the reconstruction index request and creates a copy fragment for the main fragment; copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to a new index structure to obtain new index data; and deleting the main fragment and switching the copy fragment into the main fragment. Based on the implementation mode, the search server reconstructs the index for the original data by synchronizing the original data of the main fragment to the corresponding copy fragment, and an external client does not need to read and write the original data through a query request and a write request, so that the consumption of network resources is reduced.
Drawings
FIG. 1 is a structural diagram illustrating an existing (prior-art) index reconstruction approach, according to an exemplary embodiment of the present application;
FIG. 2A is a flowchart illustrating an embodiment of a method for reconstructing an index according to an exemplary embodiment of the present application;
FIG. 2B is an exemplary distribution diagram of main fragments and copy fragments in the embodiment shown in FIG. 2A;
FIG. 2C is an exemplary diagram of the data synchronization process of a main fragment in the embodiment shown in FIG. 2A;
FIG. 3 is a diagram illustrating a hardware configuration of a search server according to an exemplary embodiment of the present application;
FIG. 4 is a block diagram illustrating an embodiment of a reconstruction index device according to an exemplary embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
FIG. 1 is a diagram illustrating an existing index reconstruction structure according to an exemplary embodiment of the present application. As shown in FIG. 1, the search server includes three search nodes, node1, node2, and node3; the old index library A includes three main fragments A1, A2, and A3; and the newly created index library A' includes three main fragments A'1, A'2, and A'3.
A client first sends a query request carrying the identifier of index library A to the search server. The search server selects a search node for index library A, such as node1, and sends the query request to node1. Node1 forwards the query request to node2, where A2 is located, and to node3, where A3 is located. Node2 and node3 send original data 2 recorded in A2 and original data 3 recorded in A3 to node1, respectively. Node1 combines original data 1 recorded in A1 with original data 2 and original data 3, and returns the combined data to the client.
The client then sends a write request carrying original data 1, original data 2, original data 3, and the new index structure to the search server, which passes them to node1. Node1 makes two copies of the new index structure. It stores original data 1 and the new index structure in A'1, sends original data 2 and one copy of the new index structure to node2, which stores them in A'2, and sends original data 3 and the other copy of the new index structure to node3, which stores them in A'3. Each search node builds an index on its main fragment according to the new index structure. After node1 receives the write success responses returned by node2 and node3, it returns a write success response to the client.
Finally, the client instructs the search server to switch from index library A to index library A' by modifying the index alias, and to delete index library A.
Therefore, the existing index reconstruction process depends on an external client: the client must first read the original data and then write the original data and the new index structure into a new index library, so a large amount of network resources is consumed in transmitting data. In addition, the query request and the write request are both handled by a single search node, which must forward the original data and the new index structure to the other search nodes, so operating efficiency is low.
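To make the cost of the client-driven flow of FIG. 1 concrete, the following minimal sketch counts how many times a fragment's original data crosses a network link. The function name and the counting unit (one transfer per fragment payload per hop) are illustrative assumptions, as is the layout of one main fragment per search node with a single coordinating node.

```python
def client_driven_transfers(num_fragments: int) -> dict:
    """Count fragment-payload transfers in the client-driven rebuild of FIG. 1.

    Assumption: one main fragment per search node, and one coordinating
    node (node1 in FIG. 1) that holds one of the fragments locally.
    """
    remote = num_fragments - 1  # fragments located on the other nodes
    return {
        # read: remote nodes -> coordinator; write: coordinator -> remote nodes
        "between_nodes": 2 * remote,
        # read: coordinator -> client; write: client -> coordinator
        "client_server": 2 * num_fragments,
    }


if __name__ == "__main__":
    # Three fragments, as in FIG. 1: 4 inter-node and 6 client-server transfers.
    print(client_driven_transfers(3))
```

Under these assumptions every fragment's original data crosses the client-server link twice, which is the overhead the prior-art description attributes to the external client.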
FIG. 2A is a flowchart of an embodiment of a method for reconstructing an index according to an exemplary embodiment of the present application. The embodiment is applied to a search server. In the embodiment of the present application, the search server is a distributed server that includes a plurality of search nodes and an index library. The index library includes a plurality of main fragments, which may be distributed on different search nodes or on the same search node. Each main fragment corresponds to an index structure and records original data and index data. As shown in FIG. 2A, the embodiment includes the following steps:
step 201: receiving a reconstruction index request carrying a new index structure, determining a main fragment corresponding to the reconstruction index request, and creating a copy fragment for the main fragment.
When the index structure changes, the existing index data in the search server can no longer meet users' requirements, and the search server must rebuild the index data. An external client sends a reconstruction index request to the search server to trigger the index reconstruction process. The request may carry the new index structure, so that the index is rebuilt using the new index structure.
To determine the main fragment corresponding to the reconstruction index request and create a copy fragment for it, the reconstruction index request may further carry an index library identifier. The search server may first obtain the index library corresponding to the index library identifier, determine all main fragments in that index library as the main fragments corresponding to the reconstruction index request, and then create a corresponding copy fragment for each main fragment in the index library.
The search server comprises a plurality of index libraries, each storing a different type of data, and different index libraries are distinguished by their index library identifiers. The reconstruction index request therefore needs to carry an index library identifier to indicate which index library is to be rebuilt; the index library identifier may also be called an index alias. In addition, the index library includes a plurality of main fragments, each storing part of the data of the index library, and the data may be evenly distributed across the main fragments. All main fragments in the index library therefore need to be rebuilt, that is, all main fragments in the index library are determined as the main fragments corresponding to the reconstruction index request.
For the process of creating a copy fragment for the main fragment, in one example, the search server may create the corresponding copy fragment on the search node where the main fragment is located.
Since the search server includes a plurality of search nodes, the main fragments in the index library may be distributed on different search nodes or on the same search node; usually they are distributed across the search nodes in a load-balanced manner. For example, if the search server has 3 search nodes and the index library includes 3 main fragments, the 3 main fragments are distributed on 3 different search nodes. To reduce communication between search nodes, the search server may create the copy fragment for each main fragment on the search node where that main fragment is located. Because the main fragments are on different search nodes, their copy fragments can be created at the same time, which improves the efficiency of index reconstruction. An example follows.
FIG. 2B is an exemplary distribution diagram of main fragments and copy fragments in the embodiment shown in FIG. 2A. As shown in FIG. 2B, the index library includes 3 main fragments a0, a1, and a2, each located on a different search node: a0 is on search node1, a1 is on search node2, and a2 is on search node3. The copy fragments that the search server creates for the 3 main fragments, a0', a1', and a2', are each located on the search node of the corresponding main fragment.
Based on this example, after the search server obtains the index library through the index library identifier, the copy fragment for each main fragment is created on the search node where that main fragment is located. In the subsequent steps, the interaction between a main fragment and its copy fragment therefore does not consume communication resources between search nodes. In addition, copy fragments can be created simultaneously for main fragments on different search nodes, which improves index reconstruction efficiency.
In another example, the search server may also create the corresponding copy fragment for the main fragment on a search node other than the one where the main fragment is located.
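Step 201 can be sketched as follows. This is a minimal in-memory model, not the search server's actual interface: the `Fragment` class, the `create_copy_fragments` function, and the `same_node` switch are hypothetical names introduced only to illustrate the two placement options described above.

```python
class Fragment:
    """Hypothetical in-memory stand-in for a main or copy fragment."""

    def __init__(self, name, node, is_main):
        self.name = name
        self.node = node
        self.is_main = is_main


def create_copy_fragments(libraries, library_id, nodes, same_node=True):
    """Create one copy fragment for every main fragment of one index library.

    libraries maps an index library identifier to its list of main fragments.
    """
    copies = []
    for main in libraries[library_id]:  # every main fragment is rebuilt
        if same_node:
            target = main.node  # first example: no inter-node traffic later
        else:
            # second example: any other node (needs at least two nodes)
            target = next(n for n in nodes if n != main.node)
        copies.append(Fragment(main.name + "'", target, is_main=False))
    return copies


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    libraries = {"A": [Fragment("a0", "node1", True),
                       Fragment("a1", "node2", True),
                       Fragment("a2", "node3", True)]}
    for copy in create_copy_fragments(libraries, "A", nodes):
        print(copy.name, "on", copy.node)
```

With `same_node=True` the result matches the layout of FIG. 2B, where each copy fragment shares a search node with its main fragment.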
Step 202: and copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to a new index structure to obtain new index data.
Specifically, on each search node, the search server synchronizes the original data of the main fragments of the index library to the corresponding copy fragments, and establishes an index for the original data in each copy fragment according to the new index structure to obtain new index data.
It should be noted that, after creating a copy fragment for a main fragment, the search server may create a reconstruction index identifier for the copy fragment to distinguish it from existing copy fragments. When data recorded in the main fragment needs to be copied to a copy fragment, the search server can check whether the copy fragment has a reconstruction index identifier: if so, only the original data recorded in the main fragment is copied to the copy fragment; if not, the original data and the index data recorded in the main fragment are both copied to the copy fragment.
In other words, when the main fragment synchronizes data, a reconstruction index identifier on the copy fragment indicates that the copy fragment is used for rebuilding the index, so only the original data in the main fragment is copied, and the index is then rebuilt with the new index structure to obtain new index data. If the copy fragment has no reconstruction index identifier, it is an existing copy fragment, and both the original data and the index data in the main fragment need to be copied into it.
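The identifier check can be illustrated with a short sketch. Fragments are modelled here as plain dictionaries; the key names `raw_data`, `index_data`, and `rebuild_flag` are assumptions made for illustration and do not come from the application.

```python
def sync_to_copy(main: dict, copy: dict) -> None:
    """Synchronize a main fragment to one of its copy fragments.

    The original data is always copied. The old index data is copied only
    when the copy fragment carries no reconstruction index identifier.
    """
    copy["raw_data"] = list(main["raw_data"])
    if copy.get("rebuild_flag"):
        # Copy fragment created for rebuilding: leave the index empty so it
        # can be built later from the new index structure.
        copy["index_data"] = {}
    else:
        # Existing (ordinary) copy fragment: mirror the main fragment fully.
        copy["index_data"] = dict(main["index_data"])


if __name__ == "__main__":
    main = {"raw_data": [{"id": "000001"}],
            "index_data": {"patent": ["000001"]}}
    rebuilding, ordinary = {"rebuild_flag": True}, {}
    sync_to_copy(main, rebuilding)
    sync_to_copy(main, ordinary)
    print(rebuilding["index_data"], ordinary["index_data"])
```

Skipping the old index data for flagged copy fragments avoids transferring data that would be discarded as soon as the new index is built.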
Continuing the example of step 201, FIG. 2C is an exemplary diagram of the data synchronization process of a main fragment in the embodiment shown in FIG. 2A. Referring to FIG. 2B and FIG. 2C, when main fragment a0 synchronizes data, its copy fragment a0' has a reconstruction index identifier, so only original data 1 on a0 needs to be copied to a0'. The processes of synchronizing data from a1 to a1' and from a2 to a2' are similar and are not described again.
For the process of establishing an index for the original data according to the new index structure to obtain new index data, the new index structure may include a field, a type corresponding to the field, and a word segmentation mode. The original data consists of a plurality of pieces of sub-original data. For each piece of sub-original data, the search server may obtain the data content corresponding to the field, set the data content to the type corresponding to the field, and then perform word segmentation on the data content using the lexicon of the word segmentation mode corresponding to the field to obtain a word segmentation result. The search server then summarizes the word segmentation results of all pieces of sub-original data and stores the summarized result in the index data table corresponding to the field. The summarized result includes the words of the data content that appear in the lexicon and, for each word, the identifiers of all pieces of sub-original data that contain it.
The new index structure may include a plurality of fields, each corresponding to a word segmentation mode and a type. For example, word segmentation modes include a Chinese word segmentation mode, an English word segmentation mode, a pinyin word segmentation mode, and the like, and field types include a character type, a date type, a numerical type, and the like. Each word segmentation mode corresponds to a lexicon, for example, a Chinese lexicon for the Chinese word segmentation mode and an English lexicon for the English word segmentation mode. The original data is composed of a plurality of pieces of sub-original data, each of which has an identifier. An example follows.
For example, the original data stored in the copy fragment consists of two documents, one identified as 000001 and the other as 000002. Each piece of sub-original data corresponds to one document, so one piece of sub-original data may be identified as 000001 and the other as 000002. The fields of the original data are filename, title, body, and author. The sub-original data identified as 000001 is:
{
    filename: "patent writing.doc";
    title: "Key precautions for patent writing";
    body: "there are a plurality of cautions in the patent writing process, and the following description is omitted.";
    author: "tensile strength";
}
The sub-original data identified as 000002 is:
{
    filename: "test data.doc";
    title: "patent protection";
    body: "to pay attention to protect intellectual property of a company";
    author: "Liguang";
}
Taking the field title as an example, word segmentation is performed on the data content of the title field. The word segmentation result for the sub-original data identified as 000001 is "patent", "writing", "of", "key", "attention", and "matters"; the word segmentation result for the sub-original data identified as 000002 is "patent" and "protection". The results are summarized and stored in the index data table of the title field, as shown in Table 1. The summarized word segmentation results are "patent", "writing", "of", "key", "attention", "matters", and "protection", and the document list for each is as follows: "patent" corresponds to documents 000001 and 000002; "writing", "of", "key", "attention", and "matters" each correspond to document 000001; and "protection" corresponds to document 000002.
Word segmentation result    Document list corresponding to the word
patent                      000001, 000002
writing                     000001
of                          000001
key                         000001
attention                   000001
matters                     000001
protection                  000002
Table 1
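The summarizing step that produces Table 1 can be sketched as a small inverted-index builder. Two simplifications are assumptions of this sketch: the titles are supplied already segmented into space-separated words (a stand-in for the lexicon-based Chinese word segmentation, which is not reproduced here), and the function name `build_field_index` is hypothetical.

```python
def build_field_index(docs, field, lexicon):
    """Build the index data table for one field: word -> document identifiers."""
    index = {}
    for doc_id, doc in docs.items():
        # Stand-in segmentation: split on spaces and keep lexicon words only.
        for word in doc[field].split():
            if word not in lexicon:
                continue
            postings = index.setdefault(word, [])
            if doc_id not in postings:  # list each document once per word
                postings.append(doc_id)
    return index


if __name__ == "__main__":
    lexicon = {"patent", "writing", "of", "key",
               "attention", "matters", "protection"}
    docs = {
        "000001": {"title": "patent writing of key attention matters"},
        "000002": {"title": "patent protection"},
    }
    for word, ids in build_field_index(docs, "title", lexicon).items():
        print(word, ids)
```

For the two example documents, the resulting table maps "patent" to both identifiers and every other word to a single document, which agrees with Table 1.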
It should be noted that, if the lexicon has also been updated, the search server may first import the new lexicon into each search node that takes part in the index reconstruction before establishing the index for the original data according to the new index structure.
Based on the description of step 202, the search server sets a reconstruction index identifier on the copy fragment to distinguish it from existing copy fragments, so that during data synchronization only the original data in the main fragment is copied to a copy fragment carrying the reconstruction index identifier. The client does not need to send the original data to a new index library through query requests and write requests, which reduces the consumption of network resources. Moreover, the search nodes can operate simultaneously, that is, every main fragment can synchronize data to its copy fragment at the same time, which reduces communication between search nodes and improves index reconstruction efficiency.
Step 203: and deleting the main fragment and switching the copy fragment into the main fragment.
Specifically, on each search node, the search server deletes the main fragments of the index library and switches the copy fragments carrying the reconstruction index identifier to main fragments, thereby completing the index reconstruction process for the index library.
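A minimal sketch of the switch in step 203 follows. It assumes that the fragments of the index library being rebuilt are held in a dictionary keyed by search node and that each fragment is a dictionary with the hypothetical keys `is_main` and `rebuild_flag`; none of these names come from the application.

```python
def switch_to_rebuilt(node_fragments: dict) -> dict:
    """Delete old main fragments and promote rebuilt copy fragments.

    node_fragments maps a search node name to the fragments of the index
    library being rebuilt that are stored on that node.
    """
    result = {}
    for node, fragments in node_fragments.items():
        kept = []
        for fragment in fragments:
            if fragment["is_main"]:
                continue  # delete the old main fragment
            if fragment.get("rebuild_flag"):
                # Promote the rebuilt copy fragment to be the new main fragment.
                fragment = {**fragment, "is_main": True, "rebuild_flag": False}
            kept.append(fragment)
        result[node] = kept
    return result


if __name__ == "__main__":
    layout = {"node1": [{"name": "a0", "is_main": True},
                        {"name": "a0'", "is_main": False, "rebuild_flag": True}]}
    print(switch_to_rebuilt(layout))
```

The function returns a new layout rather than mutating its input, which keeps the old layout available until the switch is known to have succeeded; that choice is a design assumption of the sketch.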
It should be noted that, after performing step 201 and before performing step 202, the search server may set the state identifier of the copy fragment to a first identifier, and after performing step 202, set the state identifier of the copy fragment to a second identifier. When a search request is received, the search server can then query the state identifier of the copy fragment: if the state identifier is the first identifier, the index data corresponding to the search request is acquired from the main fragment corresponding to the copy fragment; if the state identifier is the second identifier, the index data corresponding to the search request is acquired from the copy fragment.
That is, if the search server finds, in the index library corresponding to the index library identifier, that the state identifier of a copy fragment is the first identifier, the copy fragment is not yet available, and the index data corresponding to the content to be searched is queried from the main fragment corresponding to that copy fragment. If the state identifier is the second identifier, the new index data on the copy fragment is complete and can be queried directly.
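The routing rule for search requests during a rebuild can be sketched as below. The constant names `FIRST` and `SECOND` and the dictionary key `state` are illustrative assumptions; the application only specifies that two distinct state identifiers exist.

```python
FIRST = "first"    # set after step 201: new index on the copy fragment not ready
SECOND = "second"  # set after step 202: new index on the copy fragment complete


def fragment_for_search(main: dict, copy: dict) -> dict:
    """Choose the fragment that should serve a search request."""
    state = copy["state"]
    if state == FIRST:
        return main  # copy fragment unavailable, fall back to the old index
    if state == SECOND:
        return copy  # new index data can be served
    raise ValueError("unknown state identifier: %r" % (state,))


if __name__ == "__main__":
    main = {"name": "a0"}
    copy = {"name": "a0'", "state": FIRST}
    print(fragment_for_search(main, copy)["name"])
    copy["state"] = SECOND
    print(fragment_for_search(main, copy)["name"])
```

Because the main fragment keeps serving queries while the state is the first identifier, searches remain answerable throughout the rebuild.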
As can be seen from the foregoing embodiment, when receiving a reestablishment index request carrying a new index structure, a search server determines a main fragment corresponding to the reestablishment index request, and creates a copy fragment for the main fragment; copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to a new index structure to obtain new index data; and deleting the main fragment and switching the copy fragment into the main fragment. Based on the implementation mode, the search server reconstructs the index for the original data by synchronizing the original data of the main fragment to the corresponding copy fragment, and an external client does not need to read and write the original data through a query request and a write request, so that the consumption of network resources is reduced.
Corresponding to the foregoing embodiments of the reconstruction index method, the present application also provides embodiments of a reconstruction index device.
The embodiment of the reconstruction index device can be applied to a search server. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device, as a logical device, is formed by the processor of the equipment on which it runs reading the corresponding computer program instructions from nonvolatile memory into memory and executing them. In terms of hardware, FIG. 3 is a hardware structure diagram of a search server according to an exemplary embodiment of the present application. In addition to the processor, memory, network interface, and nonvolatile memory shown in FIG. 3, the equipment on which the device of the embodiment runs may include other hardware according to its actual functions, which is not described again.
FIG. 4 is a block diagram of an embodiment of a reconstruction index device according to an exemplary embodiment of the present application. The embodiment is applied to a search server that includes a main fragment; the main fragment corresponds to an index structure and records original data and index data. As shown in FIG. 4, the device includes: a receiving unit 410, a creating unit 420, a synchronization unit 430, an index establishing unit 440, and a switching unit 450.
The receiving unit 410 is configured to receive a reconstruction index request carrying a new index structure;
the creating unit 420 is configured to determine a main fragment corresponding to the reconstruction index request, and create a copy fragment for the main fragment;
the synchronization unit 430 is configured to copy the original data recorded in the main fragment to the copy fragment;
the index establishing unit 440 is configured to establish an index for the original data according to the new index structure to obtain new index data;
the switching unit 450 is configured to delete the main fragment and switch the copy fragment to the main fragment.
In an optional implementation, the search server includes an index library, the index library includes a plurality of main fragments, and the reconstruction index request further carries an index library identifier. The creating unit 420 is specifically configured to obtain the index library corresponding to the index library identifier, determine all main fragments in the index library as the main fragments corresponding to the reconstruction index request, and create a corresponding copy fragment for each main fragment in the index library.
In another optional implementation, the search server includes a plurality of search nodes, and the main fragment is located on one search node. In the process of creating a copy fragment for the main fragment, the creating unit 420 is further specifically configured to create the corresponding copy fragment on the search node where the main fragment is located, or to create the corresponding copy fragment on a search node other than the one where the main fragment is located.
In another optional implementation, the device further comprises (not shown in FIG. 4):
a search query unit, configured to: set the state identifier of the copy fragment to a first identifier after the creating unit 420 creates the copy fragment for the main fragment and before the index establishing unit 440 establishes an index for the original data according to the new index structure; set the state identifier of the copy fragment to a second identifier after the index establishing unit 440 establishes the index for the original data according to the new index structure; query the state identifier of the copy fragment when a search request is received; if the state identifier is the first identifier, acquire the index data corresponding to the search request from the main fragment corresponding to the copy fragment; and if the state identifier is the second identifier, acquire the index data corresponding to the search request from the copy fragment.
In another optional implementation, the device further comprises (not shown in FIG. 4):
a reconstruction index identification unit, specifically configured to: create a reconstruction index identifier for the copy fragment after the creating unit 420 creates the copy fragment for the main fragment; when data recorded in the main fragment needs to be copied to a copy fragment, determine whether the copy fragment has a reconstruction index identifier; if so, copy the original data recorded in the main fragment to the copy fragment; and if not, copy the original data and the index data recorded in the main fragment to the copy fragment.
For how each unit of the above apparatus implements its functions and actions, see the implementation of the corresponding steps in the above method; the details are not repeated here.
Since the apparatus embodiments substantially correspond to the method embodiments, refer to the corresponding parts of the method embodiments for relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
As can be seen from the foregoing embodiments, when the search server receives a reconstruction index request carrying a new index structure, it determines the main fragment corresponding to the request and creates a copy fragment for that main fragment; it then copies the original data recorded in the main fragment to the copy fragment and establishes an index for the original data according to the new index structure to obtain new index data; finally, it deletes the main fragment and switches the copy fragment to be the main fragment. In this way, the search server reconstructs the index by synchronizing the original data of the main fragment to the corresponding copy fragment. An external client does not need to read the original data through query requests and write it back through write requests, which reduces the consumption of network resources.
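The overall flow summarized above can be sketched end to end as follows. This is a single-process toy model under stated assumptions: `Fragment`, `build_index`, and `rebuild_index` are hypothetical names, the "index structure" is modeled as a function that maps a document to its index keys, and fragments are held in a plain dictionary rather than on search nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an index structure is modeled as "document text -> index keys".
IndexStructure = Callable[[str], List[str]]


@dataclass
class Fragment:
    original_data: Dict[int, str] = field(default_factory=dict)
    index_data: Dict[str, List[int]] = field(default_factory=dict)


def build_index(original_data: Dict[int, str], structure: IndexStructure) -> Dict[str, List[int]]:
    """Index every document according to the given index structure."""
    index: Dict[str, List[int]] = {}
    for doc_id, text in sorted(original_data.items()):
        for key in structure(text):
            index.setdefault(key, []).append(doc_id)
    return index


def rebuild_index(fragments: Dict[str, Fragment], main_id: str, new_structure: IndexStructure) -> None:
    """Rebuild the index of one main fragment inside the server."""
    main = fragments[main_id]
    copy = Fragment()                                # create a copy fragment
    copy.original_data = dict(main.original_data)    # copy the original data only
    copy.index_data = build_index(copy.original_data, new_structure)
    fragments[main_id] = copy                        # drop the old main, promote the copy
```

For example, switching from a case-sensitive to a lowercased tokenizer would be `rebuild_index(fragments, "f0", lambda text: text.lower().split())`. All data movement happens between fragments inside the server, which is the source of the network saving the paragraph describes.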
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (10)

1. An index reconstruction method, the method being applied to a search server, wherein the search server includes a main fragment, the main fragment corresponds to an index structure, and original data and index data are recorded in the main fragment, the method comprising:
receiving a reconstruction index request carrying a new index structure, determining a main fragment corresponding to the reconstruction index request, creating a copy fragment for the main fragment, and creating a reconstruction index identifier for the copy fragment;
when original data recorded in the main fragment needs to be copied to a copy fragment, judging whether the copy fragment has a reconstruction index identifier;
if so, copying the original data recorded in the main fragment to the copy fragment, and establishing an index for the original data according to the new index structure to obtain new index data;
and deleting the main fragment and switching the copy fragment into the main fragment.
2. The method according to claim 1, wherein the search server includes an index library, the index library includes a plurality of main fragments, the reconstruction index request further carries an index library identifier, and the process of determining the main fragment corresponding to the reconstruction index request and creating the copy fragment for the main fragment specifically includes:
acquiring the index library corresponding to the index library identifier, and determining all main fragments in the index library as the main fragments corresponding to the reconstruction index request;
creating a corresponding copy fragment for each main fragment in the index library.
3. The method according to claim 1, wherein the search server includes a plurality of search nodes, the main fragment is located on one search node, and the process of creating a copy fragment for the main fragment specifically includes:
creating a corresponding copy fragment for the main fragment on the search node where the main fragment is located; or creating a corresponding copy fragment for the main fragment on a search node other than the search node where the main fragment is located.
4. The method of claim 1, further comprising:
after the copy fragment is created for the main fragment and before an index is established for the original data according to the new index structure, setting a state identifier of the copy fragment to a first identifier; after the index is established for the original data according to the new index structure, setting the state identifier of the copy fragment to a second identifier;
when a search request is received, querying the state identifier of the copy fragment;
if the state identifier is the first identifier, acquiring index data corresponding to the search request from the main fragment corresponding to the copy fragment;
and if the state identifier is the second identifier, acquiring index data corresponding to the search request from the copy fragment.
5. The method of claim 1, further comprising:
when the original data recorded in the main fragment needs to be copied to a copy fragment and it is judged that the copy fragment does not have a reconstruction index identifier, copying the original data and the index data recorded in the main fragment to the copy fragment.
6. An index reconstruction apparatus, the apparatus being applied to a search server, wherein the search server includes a main fragment, the main fragment corresponds to an index structure and records original data and index data, the apparatus comprising:
a receiving unit, configured to receive a reconstruction index request carrying a new index structure;
a creating unit, configured to determine a main fragment corresponding to the reconstruction index request, create a copy fragment for the main fragment, and create a reconstruction index identifier for the copy fragment;
a synchronization unit, configured to judge whether the copy fragment has a reconstruction index identifier when original data recorded in the main fragment needs to be copied to the copy fragment, and if so, copy the original data recorded in the main fragment to the copy fragment;
an index establishing unit, configured to establish an index for the original data according to the new index structure to obtain new index data;
and a switching unit, configured to delete the main fragment and switch the copy fragment to be the main fragment.
7. The apparatus of claim 6, wherein the search server comprises an index library, the index library comprises a plurality of main fragments, the reconstruction index request further carries an index library identifier,
the creating unit is specifically configured to acquire the index library corresponding to the index library identifier, determine all main fragments in the index library as the main fragments corresponding to the reconstruction index request, and create a corresponding copy fragment for each main fragment in the index library.
8. The apparatus of claim 6, wherein the search server comprises a plurality of search nodes, wherein the main fragment is located on one search node,
the creating unit is further specifically configured to, in the process of creating a copy fragment for the main fragment, create a corresponding copy fragment for the main fragment on the search node where the main fragment is located; or create a corresponding copy fragment for the main fragment on a search node other than the search node where the main fragment is located.
9. The apparatus of claim 6, further comprising:
a search query unit, configured to: set a state identifier of the copy fragment to a first identifier after the creating unit creates the copy fragment for the main fragment and before the index establishing unit establishes an index for the original data according to the new index structure; set the state identifier of the copy fragment to a second identifier after the index establishing unit establishes the index for the original data according to the new index structure; query the state identifier of the copy fragment when a search request is received; if the state identifier is the first identifier, acquire index data corresponding to the search request from the main fragment corresponding to the copy fragment; and if the state identifier is the second identifier, acquire index data corresponding to the search request from the copy fragment.
10. The apparatus of claim 6, further comprising:
a reconstruction index identification unit, specifically configured to: when original data recorded in the main fragment needs to be copied to a copy fragment, judge whether the copy fragment has a reconstruction index identifier, and if not, copy the original data and the index data recorded in the main fragment to the copy fragment.
CN201610817528.7A 2016-09-12 2016-09-12 Index reconstruction method and device Active CN106407376B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610817528.7A CN106407376B (en) 2016-09-12 2016-09-12 Index reconstruction method and device
CN201911129763.5A CN110990399B (en) 2016-09-12 2016-09-12 Reconstruction index method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610817528.7A CN106407376B (en) 2016-09-12 2016-09-12 Index reconstruction method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201911129763.5A Division CN110990399B (en) 2016-09-12 2016-09-12 Reconstruction index method and device

Publications (2)

Publication Number Publication Date
CN106407376A CN106407376A (en) 2017-02-15
CN106407376B true CN106407376B (en) 2019-12-20

Family

ID=57999212

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201610817528.7A Active CN106407376B (en) 2016-09-12 2016-09-12 Index reconstruction method and device
CN201911129763.5A Active CN110990399B (en) 2016-09-12 2016-09-12 Reconstruction index method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201911129763.5A Active CN110990399B (en) 2016-09-12 2016-09-12 Reconstruction index method and device

Country Status (1)

Country Link
CN (2) CN106407376B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110609865B (en) * 2018-05-29 2022-04-15 优信拍(北京)信息科技有限公司 Information synchronization method, device and system
CN110765125B (en) * 2018-07-25 2022-09-20 杭州海康威视数字技术股份有限公司 Method and device for storing data
CN110321322B (en) * 2019-07-02 2023-07-14 深信服科技股份有限公司 Data reconstruction method, device, equipment and computer readable storage medium
CN110442645B (en) * 2019-07-11 2020-09-15 新华三大数据技术有限公司 Data indexing method and device
CN111061431B (en) * 2019-11-28 2023-06-23 曙光信息产业股份有限公司 Distributed storage method, server and client

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101196935A (en) * 2008-01-03 2008-06-11 中兴通讯股份有限公司 System and method for creating index database
CN102779160A (en) * 2012-06-14 2012-11-14 中金数据系统有限公司 Mass data information indexing system and indexing construction method
CN103258036A (en) * 2013-05-15 2013-08-21 广州一呼百应网络技术有限公司 Distributed real-time search engine based on p2p
CN103605657A (en) * 2013-10-14 2014-02-26 华为技术有限公司 Method and device for reconstructing index online
CN104156367A (en) * 2013-05-14 2014-11-19 阿里巴巴集团控股有限公司 Search engine capacity expansion method and search service system

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN101295323B (en) * 2008-06-30 2011-11-02 腾讯科技(深圳)有限公司 Processing method and system for index updating
US8239389B2 (en) * 2008-09-29 2012-08-07 International Business Machines Corporation Persisting external index data in a database
JP2013073557A (en) * 2011-09-29 2013-04-22 Hitachi Solutions Ltd Information search system, search server and program
CN103198108B (en) * 2013-03-27 2016-08-10 新浪网技术(中国)有限公司 A kind of index data update method, retrieval server and system
CN103310023A (en) * 2013-07-05 2013-09-18 深圳中兴网信科技有限公司 Distributed searching system and method
CN103488687A (en) * 2013-09-02 2014-01-01 用友软件股份有限公司 Searching system and searching method of big data

Also Published As

Publication number Publication date
CN106407376A (en) 2017-02-15
CN110990399B (en) 2023-04-28
CN110990399A (en) 2020-04-10

Similar Documents

Publication Publication Date Title
CN106407376B (en) Index reconstruction method and device
US7680852B2 (en) Search processing method and search system
CN110502507B (en) Management system, method, equipment and storage medium of distributed database
US9792340B2 (en) Identifying data items
US8683112B2 (en) Asynchronous distributed object uploading for replicated content addressable storage clusters
US8396938B2 (en) Providing direct access to distributed managed content
CN102110121B (en) A kind of data processing method and system thereof
US8527556B2 (en) Systems and methods to update a content store associated with a search index
US9515878B2 (en) Method, medium, and system for configuring a new node in a distributed memory network
US20090210429A1 (en) System and method for asynchronous update of indexes in a distributed database
JP2019519025A (en) Division and movement of ranges in distributed systems
CN102741800A (en) Storage system for eliminating duplicated data
CN111078667B (en) Data migration method and related device
CN105574187A (en) Duplication transaction consistency guaranteeing method and system for heterogeneous databases
JP2009259007A (en) Distributed storage method, distributed storage system and distributed storage device
CN117043763A (en) Volatile database cache in database accelerator
EP3998535A1 (en) Air freight rate data caching method and system
JP5684671B2 (en) Condition retrieval data storage method, condition retrieval database cluster system, dispatcher, and program
US10025680B2 (en) High throughput, high reliability data processing system
CN113377763A (en) Database table switching method and device, electronic equipment and computer storage medium
CN117407391A (en) Full text indexing method, device, computer equipment and storage medium of database
KR101646954B1 (en) Database apparatus, database management methof performing in database apparatus and storage media storing the same
US20200342065A1 (en) Replicating user created snapshots
CN113312351A (en) Data processing method and device
CN115587090A (en) Data storage method, device, equipment and medium based on Doris

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant