CN111352936A - Method and storage medium for ES index reconstruction

Publication number
CN111352936A
Authority
CN
China
Prior art keywords
index
consumption
old
consumption group
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010081576.0A
Other languages
Chinese (zh)
Inventor
刘德建
林伟
郭玉湖
陈宏�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Tianquan Educational Technology Ltd
Original Assignee
Fujian Tianquan Educational Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Tianquan Educational Technology Ltd
Priority to CN202010081576.0A
Publication of CN111352936A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures

Abstract

The invention provides a method and a storage medium for rebuilding an Elasticsearch (ES) index. The method comprises the following steps: creating a consumption group thread that waits to be triggered before consuming; synchronously writing the index operations on the old index into the consumption group; configuring the settings and mapping fields of the new index; rebuilding the index; triggering the consumption group to consume, so that the index operations in the consumption group are consumed into the newly built index; and, when the consumption lag of the consumption group is lower than a threshold, switching the alias from the old index to the new index. The invention can switch ES indices without stopping the service, giving users a nearly imperceptible experience, and it is also efficient, stable, and low in cost.

Description

Method and storage medium for ES index reconstruction
Technical Field
The invention relates to the field of database search, and in particular to a method and a storage medium for rebuilding an ES index.
Background
With the rapid development of the mobile internet, business systems face scenarios that require complex searches over big data, for which the traditional relational database MySQL is not suitable.
Elasticsearch is a distributed full-text search engine built on Lucene that provides a near-real-time solution for searches with complex conditions. Its principle is as follows: a user submits data to Elasticsearch; an analyzer (tokenizer) splits the corresponding text into terms; the terms are stored together with their weights in an inverted index; and when a user searches, the matching documents are scored and ranked by those weights and returned to the user. Data in Elasticsearch is stored in indices, and each index generally needs its settings and the mapping type of each field to be defined in advance. However, once an index has been created, fields can only be added to its mapping; the mapping of an existing field cannot be changed. In real online business scenarios, it is common to find that a field type was set incorrectly or that the online Elasticsearch index contains dirty data, and in these cases the index has to be rebuilt. To affect the service as little as possible, the existing technical solution rebuilds the index, switches the alias, and appends the missing data afterwards, as follows:
(1) create the new index with its mapping types and related settings;
(2) stop inserting, modifying, and deleting data in the old index;
(3) copy the data of the old index into the new index through the reindex operation;
(4) remove the alias from the old index and associate it with the new index;
(5) append the missing data.
However, from step (2) onward the data of the old index is no longer updated, so users search stale data and cannot find the latest data until step (5) is completed. This may take only a few minutes if the old index is small, but when the old index holds hundreds of gigabytes or even terabytes of data, users may wait several hours before they can query the latest data, which is intolerable for consumer-facing (C-side) internet search services with strict real-time requirements.
Therefore, it is necessary to provide an effective solution to the problem that the user may not search the latest data for a long time due to the reconstruction of the index, which brings a bad experience to the user.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an ES index reconstruction method and a storage medium whose operation is imperceptible to users, thereby significantly improving the user experience.
In order to solve the technical problems, the invention adopts the technical scheme that:
creating a consumption group thread waiting for triggering consumption;
after the setting and mapping fields of the new index are configured, the index operation corresponding to the old index is synchronously written into the consumption group;
reconstructing an index;
triggering the consumption group to consume, and consuming the index operation in the consumption group into the newly-built index;
when the consumption delay of the consumption group is lower than a threshold value, the association of the old index and the alias thereof is switched to the association of the new index and the alias.
The invention provides another technical scheme as follows:
a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, is capable of implementing the steps included in the above-mentioned method for ES reconstruction index.
The invention has the following beneficial effects: to address the problem that users cannot properly use the index for a long time while it is being rebuilt, the invention uses the dual consumption group mode of a message queue, on the premise that message queue consumption is idempotent, to rebuild the index without interrupting normal use of the old index, giving users a nearly imperceptible experience.
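Replaying recorded operations onto an index that already contains some of their effects is only safe if applying an operation twice gives the same result as applying it once. The following is a minimal sketch of that property, assuming operations are full-document writes or deletes keyed by document id; the operation format (`type`, `id`, `doc`) and the in-memory dictionary standing in for an index are illustrative assumptions, not part of the patent.

```python
from typing import Dict

Index = Dict[str, dict]  # hypothetical in-memory stand-in for an ES index


def apply_op(index: Index, op: dict) -> None:
    """Apply one index operation; repeating it leaves the index unchanged."""
    if op["type"] == "delete":
        index.pop(op["id"], None)          # deleting a missing document is a no-op
    else:                                  # "index": full-document overwrite
        index[op["id"]] = dict(op["doc"])


# Document "1" was already copied by the rebuild before replay starts.
new_index: Index = {"1": {"status": "paid"}}
ops = [
    {"type": "index", "id": "1", "doc": {"status": "paid"}},  # also in the copy
    {"type": "index", "id": "2", "doc": {"status": "new"}},
    {"type": "delete", "id": "3"},
]
for op in ops + ops:                       # replay every operation twice
    apply_op(new_index, op)
print(new_index)
```

Operations that are not full overwrites, such as counters incremented in place, would not be idempotent under this scheme and would need separate handling.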
Drawings
FIG. 1 is a flowchart illustrating a method for reconstructing an index by an ES according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an ES index reconstruction method according to an embodiment of the present invention.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
The key concept of the invention is as follows: the old index stays in use, the index operations on the old index are tracked through dual consumption groups, and after the new index has been rebuilt the missing data is appended through the consumption group, giving users a nearly imperceptible experience.
The technical terms related to the invention are explained as follows:
[Table of technical terms: image BDA0002380499620000031 in the original publication, not reproduced in this text.]
Referring to FIG. 1, the present invention provides a method for reconstructing an ES index, including:
creating a consumption group thread waiting for triggering consumption;
synchronously writing the index operation corresponding to the old index into the consumption group;
configuring setting and mapping fields of the new index;
reconstructing an index;
triggering the consumption group to consume, and consuming the index operation in the consumption group into the newly-built index;
when the consumption delay of the consumption group is lower than a threshold value, the association of the old index and the alias thereof is switched to the association of the new index and the alias.
From the above description, the beneficial effects of the invention are as follows: a new consumption group is created to track the index operation events of the old index; after the mapping of a new index that meets the requirements has been established, the index is rebuilt through reindex without stopping inserts, modifications, or deletions on the old index (so users can still query the latest data); after the rebuild is completed, the new consumption group consumes the recorded index operations into the new index to append the missing data; and the alias is switched only after the data has been fully appended. The whole operation is imperceptible to users.
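The overall flow can be illustrated with a small in-memory simulation. This is a sketch under simplifying assumptions: dictionaries stand in for indices, a `deque` stands in for the consumption group, only upserts are modeled, and the function name `rebuild_without_downtime` and its parameters are hypothetical.

```python
from collections import deque


def rebuild_without_downtime(old_index, live_ops, lag_threshold=0):
    """Simulate the flow; live_ops are writes that arrive during the copy."""
    aliases = {"alias": "old"}
    indices = {"old": dict(old_index), "new": {}}
    buffered = deque()                      # consumption group, not yet triggered

    snapshot = dict(indices["old"])         # the rebuild copies this state
    for doc_id, doc in live_ops:            # writes keep hitting the old index
        indices["old"][doc_id] = doc
        buffered.append((doc_id, doc))      # and are recorded for replay
    indices["new"].update(snapshot)         # copy finished

    while len(buffered) > lag_threshold:    # triggered: consume until caught up
        doc_id, doc = buffered.popleft()
        indices["new"][doc_id] = doc

    aliases["alias"] = "new"                # switch the alias last
    return aliases, indices


aliases, indices = rebuild_without_downtime(
    {"1": {"v": 1}},
    [("2", {"v": 2}), ("1", {"v": 3})],
)
print(aliases, indices["new"])
```

With `lag_threshold=0` the buffer is drained completely before the switch, so the new index matches the old one at the moment the alias moves.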
Further, when the consumption delay of the consumption group is lower than a threshold, the method further comprises the following steps:
stopping consumption behavior of the consumption group;
deleting the consumption group and old index.
As can be seen from the above description, after the new index is successfully reconstructed and can be put into use, the consumption group and the old index are deleted, so that unnecessary resource loss can be avoided.
Further, the index operation corresponding to the old index is synchronously written into the consumption group, specifically:
receiving an index operation instruction corresponding to the old index;
and writing the index operation to an old index according to the instruction, and simultaneously writing the index operation to the consumption group.
As can be seen from the above description, the old index will keep working normally in the process of rebuilding the new index and before the new index is put into use, so as to provide good experience for the user.
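The dual write described above can be sketched as a small wrapper. The two callables `write_old` and `publish` are hypothetical placeholders for "write to the old index" and "write to the consumption group"; the lists used below only stand in for those targets.

```python
from typing import Callable, List


def make_dual_writer(
    write_old: Callable[[dict], None],
    publish: Callable[[dict], None],
) -> Callable[[dict], None]:
    """Return a handler that applies each operation to both targets."""

    def handle(op: dict) -> None:
        write_old(op)   # the old index stays current for searches
        publish(op)     # the same operation is recorded for later replay

    return handle


old_index_log: List[dict] = []
queue_log: List[dict] = []
handle = make_dual_writer(old_index_log.append, queue_log.append)
handle({"type": "index", "id": "1", "doc": {"status": "paid"}})
print(old_index_log == queue_log)
```

This sketch does not address what happens if the second write fails after the first succeeds; a real system would need a retry or reconciliation strategy for that case.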
Further, after configuring the setting and mapping fields of the new index, the method further includes:
and configuring parameters of the new index, wherein the parameters specify how dirty data in the index is to be corrected.
As can be seen from the above description, in the process of reconstructing the new index, the dirty data in the old index can be corrected through the parameter configuration of the new index.
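One way such a correction could be expressed is as a script attached to the Elasticsearch reindex request. The sketch below only builds the JSON body for `POST _reindex`; the index names, the `status` field, and the correction rule are hypothetical examples, and sending the request is left out.

```python
import json
from typing import Optional


def build_reindex_body(source: str, dest: str, script: Optional[str] = None) -> dict:
    """Build the request body for Elasticsearch's _reindex API."""
    body = {"source": {"index": source}, "dest": {"index": dest}}
    if script is not None:
        # The script runs against each copied document and may modify ctx._source.
        body["script"] = {"lang": "painless", "source": script}
    return body


# Hypothetical rule: give documents without a status a default value.
fix_status = "if (ctx._source.status == null) { ctx._source.status = 'unknown' }"
body = build_reindex_body("old_index", "new_index", fix_status)
print(json.dumps(body, indent=2))
```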
Further, the reconstructing the index specifically includes:
copying the data of the old index to the new index.
As can be seen from the above description, copying the data of the old index as-is carries the functionality of the old index over to the new index.
Further, the consumption group is a message queue of a kafka topic or a rabbitmq topic or an activemq topic.
As can be seen from the above description, supporting several types of message queue makes the method more flexible.
The invention provides another technical scheme as follows:
a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is capable of implementing the steps of a method of ES reconstruction indexing comprising:
creating a consumption group thread waiting for triggering consumption;
synchronously writing the index operation corresponding to the old index into the consumption group;
configuring setting and mapping fields of the new index;
reconstructing an index;
triggering the consumption group to consume, and consuming the index operation in the consumption group into the newly-built index;
when the consumption delay of the consumption group is lower than a threshold value, the association of the old index and the alias thereof is switched to the association of the new index and the alias.
Further, when the consumption delay of the consumption group is lower than a threshold, the method further comprises the following steps:
stopping consumption behavior of the consumption group;
deleting the consumption group and old index.
Further, the index operation corresponding to the old index is synchronously written into the consumption group, specifically:
receiving an index operation instruction corresponding to the old index;
and writing the index operation to an old index according to the instruction, and simultaneously writing the index operation to the consumption group.
Further, after configuring the setting and mapping fields of the new index, the method further includes:
and configuring parameters of the new index, wherein the parameters specify how dirty data in the index is to be corrected.
Further, the reconstructing the index specifically includes:
copying the data of the old index to the new index.
Further, the consumption group is a message queue of a kafka topic or a rabbitmq topic or an activemq topic.
As those skilled in the art will understand from the above description, all or part of the processes in the above technical solutions can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above methods. When executed by a processor, the program also achieves the beneficial effects of the corresponding methods.
The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Example one
Referring to FIG. 2, this embodiment provides a method for rebuilding an ES index in a way that users do not notice.
The method records the index operations on the old index in a message queue, consumes the queued messages into the new index once it has been rebuilt so that the new index catches up, and never takes the old index out of service.
The message queue in this embodiment uses the dual consumption group mode and can be a kafka topic, a rabbitmq topic, or an activemq topic. This example uses a kafka topic.
The method comprises the following steps:
S1: creating a consumption group thread waiting for triggering consumption;
Specifically, a thread is created for a consumption group new_group on the kafka topic; the consumption group consumes messages only after it has been triggered.
S2: synchronously writing the index operation corresponding to the old index into the consumption group;
After the consumption group thread is created, every index operation that users perform on the old index is also written to the consumption group, but nothing is consumed yet.
Note that the old index never stops working. Throughout the rebuild, until the old index is deleted, user searches are still served from the old index, and all index operations (inserting, modifying, and deleting data) are still executed on it.
In this step, each of those index operations on the old index is additionally written to the consumption group.
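Steps S1 and S2 amount to a consumer that exists but stays idle while operations accumulate. The following sketch shows that behavior with the standard library only; `queue.Queue` stands in for the topic, the class name `TriggeredConsumer` is hypothetical, and a real Kafka consumer group would retain messages in the topic log rather than in process memory.

```python
import queue
import threading


class TriggeredConsumer(threading.Thread):
    """Consumer thread that stays idle until its trigger event is set."""

    def __init__(self, source: "queue.Queue", sink: list):
        super().__init__(daemon=True)
        self.source, self.sink = source, sink
        self.trigger = threading.Event()    # set once the rebuild has finished
        self.stopped = threading.Event()    # set to end consumption

    def run(self) -> None:
        self.trigger.wait()                 # block until triggered
        while not self.stopped.is_set():
            try:
                op = self.source.get(timeout=0.05)
            except queue.Empty:
                continue
            self.sink.append(op)            # stand-in for writing to the new index
            self.source.task_done()


ops: "queue.Queue" = queue.Queue()
applied: list = []
consumer = TriggeredConsumer(ops, applied)
consumer.start()
ops.put({"id": "1"})                        # recorded, but not consumed yet
applied_before_trigger = list(applied)
consumer.trigger.set()                      # rebuild finished: start consuming
ops.join()                                  # wait until the backlog is drained
consumer.stopped.set()
consumer.join()
print(applied_before_trigger, applied)
```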
S3: creating a new index new_index;
Specifically, in the database that holds the old index old_index to be modified, the new index new_index is created with the required field mappings and the other related index settings.
In a specific example, this step also configures the parameters of the new index according to business requirements, for example by writing the correction of dirty data into a script.
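As an illustration of step S3, the body of a create-index request carries both the settings and the field mappings. The sketch below assumes Elasticsearch 7 or later (typeless mappings); the field names, types, and shard counts are hypothetical, and the body would be sent with `PUT /new_index`.

```python
import json


def build_create_index_body(properties: dict, shards: int = 1, replicas: int = 1) -> dict:
    """Build the request body for creating an index with explicit mappings."""
    return {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {"properties": properties},
    }


# Hypothetical corrected mapping for an order index.
create_body = build_create_index_body(
    {
        "order_id": {"type": "keyword"},
        "created_at": {"type": "date"},
        "title": {"type": "text"},
    },
    shards=3,
)
print(json.dumps(create_body, indent=2))
```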
S4: reconstructing the index.
Specifically, a reindex operation is performed, and Elasticsearch copies the data of the old index into the new index.
S5: after step S4 is completed, the consumption group created in step S1 is triggered to consume its messages, which brings the new index up to date.
That is, from this point on, the index operations recorded in the consumption group are written into the new index.
S6: monitoring the consumption lag of the consumption group new_group, and executing the next step when the lag is lower than a preset threshold, that is, when it has stabilized at a small value, which indicates that newly produced data is being consumed in near real time;
S7: switching alias associations;
Specifically, the alias is switched with the atomic alias operation built into Elasticsearch: the association between the alias and the old index is removed and the alias is associated with the new index in the same operation.
S8: stopping consumption behavior of the consumption group;
S9: deleting the consumption group and old index.
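The alias switch in step S7 maps onto the Elasticsearch `_aliases` API, which applies a list of actions atomically. The sketch below only builds the request body for `POST /_aliases`; the function name and the example alias and index names are illustrative.

```python
import json


def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build an _aliases request that moves an alias in one atomic call."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap = build_alias_swap("my_alias", "old_index", "new_index")
print(json.dumps(swap, indent=2))
```

Because both actions travel in one request, searches through the alias never see a moment in which it points to no index.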
Example two
This embodiment provides a specific application scenario corresponding to the first embodiment:
the Kafka topic is topic_order, and the corresponding consumption group is consumer_order_group;
the old order index name is index_order, and the corresponding alias is alias_order;
the new order index name is index_order_new.
The method comprises the following steps:
1. The existing order program queries the old order index index_order through the alias alias_order and writes order data to it; at the same time, each newly added piece of order data is also written to a newly created consumption group consumer_order_group that waits to be triggered.
That is, after modification, the order program still queries and writes order data to index_order through the alias alias_order, and it additionally writes each piece of order data to the kafka topic topic_order.
2. An operation of reconstructing an index in the ES is performed.
Parameters of the reindex are set, for example a script that corrects dirty data, and the reindex operation is performed. Elasticsearch copies the data of the old index index_order to the new index index_order_new.
During this time, the order program keeps performing step 1: it continues to write to and query the old index index_order, so its operation is not affected.
3. When step 2 is completed, that is, index_order_new has been created and the old index data has been copied into it, the consumer_order_group consumption group thread is started, and from this moment the index messages in topic_order are incrementally consumed into the new index index_order_new.
4. Observe the consumption lag of the consumption group consumer_order_group_new. When the lag is small, newly produced data is being consumed in near real time, and the next operation can be performed.
5. Switch the alias association.
The alias is switched with the atomic alias operation built into Elasticsearch: alias_order is removed from the old index index_order and associated with the new index index_order_new at the same time. The switch completes quickly.
6. Stop consumption by consumer_order_group;
7. Delete the consumption group consumer_order_group and the index index_order.
At this point, the ES index has been switched without stopping the service.
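The lag observation in step 4 can be made concrete. For a Kafka consumer group, lag per partition is the log end offset minus the group's committed offset. The sketch below computes the total from two plain dictionaries and adds a simple stability check; the function names, the window size, and the choice to treat a missing committed offset as 0 are assumptions for illustration, and fetching the offsets from a broker is not shown.

```python
from typing import Dict, Sequence


def total_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> int:
    """Sum per-partition lag: log end offset minus committed offset."""
    return sum(
        max(end - committed.get(partition, 0), 0)   # assumption: no commit means 0
        for partition, end in end_offsets.items()
    )


def lag_is_stable(samples: Sequence[int], threshold: int, window: int = 3) -> bool:
    """True when the last `window` lag samples are all below the threshold."""
    recent = list(samples)[-window:]
    return len(recent) == window and all(lag < threshold for lag in recent)


current = total_lag({0: 100, 1: 50}, {0: 90, 1: 50})
print(current, lag_is_stable([500, 40, 8, 5, 3], threshold=10))
```

Requiring several consecutive samples below the threshold avoids switching the alias on a single lucky reading while the consumer is still catching up.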
Example three
This embodiment provides another specific application scenario corresponding to the first embodiment:
service scenario (order):
On a large e-commerce platform, users' order data is often stored in an order index in Elasticsearch; the index can hold billions of documents and occupy 2 to 3 TB of disk space. Thanks to the distributed search of Elasticsearch, a user can find his or her own orders among billions of orders within milliseconds.
Service requirements are as follows:
Some users report that their online orders cannot be found by search.
The technical scheme is as follows:
Because the field mapping of the online order index is misconfigured, users cannot find the correct orders. Since the index already holds billions of documents, the conventional rebuild scheme could leave users unable to find their latest orders for hours. The new Elasticsearch index rebuilding method is therefore adopted.
Basic information:
the Kafka topic is topic_order;
the order index name is index_order, and its alias is alias_order;
the corresponding consumption group is consumer_order_group.
The method comprises the following specific steps:
1. Create a new consumption group consumer_order_group_new on the kafka topic topic_order with the consumption offset set to latest (that is, it only tracks messages produced from the moment it starts, and no messages are consumed yet); the consumption group listens to the index operations on the old index, that is, the index operations written synchronously alongside the old index.
2. Modify the field attributes of the order index as required, and create a new order index index_order_new;
3. Set the parameters of the reindex, for example a script that corrects dirty data, and perform the reindex operation; Elasticsearch copies the data of the old index index_order to the new index index_order_new.
4. After step 3 is completed, start consumption by the consumption group consumer_order_group_new created in step 1, which appends the new data to the new index.
5. Observe the consumption lag of the consumption group consumer_order_group_new. When the lag is small, newly produced data is being consumed in near real time, and the next operation can be performed.
6. Switch the alias association, using the atomic alias operation built into Elasticsearch: alias_order is removed from the old index index_order and associated with the new index index_order_new at the same time.
7. Stop consumption by consumer_order_group;
8. Delete the consumption group consumer_order_group and the index index_order.
To summarize: after these operations, the problem that some users could not find their orders is solved, and users' ability to query their latest orders was never affected.
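The "latest" offset in step 1 corresponds to the standard Kafka consumer property `auto.offset.reset`. The sketch below builds such a configuration as a plain dictionary of Kafka property names; the broker address is a placeholder, the function name is hypothetical, and how the dictionary is handed to a particular client library is left as an assumption.

```python
def build_consumer_config(group_id: str, bootstrap_servers: str) -> dict:
    """Kafka consumer properties for a group that starts at the log end."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "latest",   # start from new messages if no offset is committed
        "enable.auto.commit": False,     # commit only after an operation is applied
    }


config = build_consumer_config("consumer_order_group_new", "localhost:9092")
print(config)
```

Note that `auto.offset.reset` only takes effect when the group has no committed offset. The group's starting position would therefore need to be fixed before the rebuild begins, for example by committing the end offsets at that moment, so that operations recorded during the rebuild are not skipped when consumption is triggered.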
Example four
Corresponding to the first to third embodiments, this embodiment provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the ES index reconstruction method according to any one of the first to third embodiments. The detailed steps are not repeated here; refer to the descriptions of the first to third embodiments.
In summary, the method and the storage medium for reconstructing the ES index provided by the present invention can switch ES indices without stopping the service, giving users a nearly imperceptible experience; the invention is also efficient, stable, and low in cost.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (7)

  1. A method of ES reconstruction indexing, comprising:
    creating a consumption group thread waiting for triggering consumption;
    synchronously writing the index operation corresponding to the old index into the consumption group;
    configuring setting and mapping fields of the new index;
    reconstructing an index;
    triggering the consumption group to consume, and consuming the index operation in the consumption group into the newly-built index;
    when the consumption delay of the consumption group is lower than a threshold value, the association of the old index and the alias thereof is switched to the association of the new index and the alias.
  2. The ES index reconstruction method of claim 1, wherein when the consumption delay of the consumption group is below a threshold, then further comprising:
    stopping consumption behavior of the consumption group;
    deleting the consumption group and old index.
  3. The ES index rebuilding method of claim 1, wherein said index operation corresponding to the old index is synchronously written into said consumption group, specifically:
    receiving an index operation instruction corresponding to the old index;
    and writing the index operation to an old index according to the instruction, and simultaneously writing the index operation to the consumption group.
  4. The method for ES rebuilding index of claim 1, wherein after configuring the setting and mapping fields of the new index, further comprising:
    and configuring parameters of the new index, wherein the parameters correspond to the dirty data in the corrected index.
  5. The ES index reconstruction method according to claim 1, wherein the index reconstruction method specifically comprises:
    copying the data of the old index to the new index.
  6. The ES re-indexing method of claim 1, wherein the consumption group is a message queue of a kafka topic or a rabbitmq topic or an activemq topic.
  7. A computer-readable storage medium, on which a computer program is stored, the program being capable of implementing the steps included in the method for ES reconstruction index according to any one of claims 1 to 6 when the program is executed by a processor.
CN202010081576.0A 2020-02-06 2020-02-06 Method and storage medium for ES index reconstruction Pending CN111352936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010081576.0A CN111352936A (en) 2020-02-06 2020-02-06 Method and storage medium for ES index reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010081576.0A CN111352936A (en) 2020-02-06 2020-02-06 Method and storage medium for ES index reconstruction

Publications (1)

Publication Number Publication Date
CN111352936A true CN111352936A (en) 2020-06-30

Family

ID=71196972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010081576.0A Pending CN111352936A (en) 2020-02-06 2020-02-06 Method and storage medium for ES index reconstruction

Country Status (1)

Country Link
CN (1) CN111352936A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170371018A1 (en) * 2016-06-22 2017-12-28 Comsats Institute Of Information Technology Fpga implementation of a real-time parallel mri reconstruction
CN110609865A (en) * 2018-05-29 2019-12-24 优信拍(北京)信息科技有限公司 Information synchronization method, device and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEIXIN_33972649: "ElasticSearch数据同步与无缝迁移", 《HTTPS://BLOG.CSDN.NET/WEIXIN_33972649/ARTICLE/DETAILS/89587933》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507187A (en) * 2020-11-11 2021-03-16 贝壳技术有限公司 Index changing method and device
CN112835980A (en) * 2021-02-05 2021-05-25 北京字跳网络技术有限公司 Index reconstruction method, device, equipment, computer readable storage medium and product
CN112835980B (en) * 2021-02-05 2024-04-16 北京字跳网络技术有限公司 Index reconstruction method, device, equipment, computer readable storage medium and product
CN115495634A (en) * 2022-11-17 2022-12-20 北京滴普科技有限公司 Method and system for capturing change data based on Elasticissearch plug-in

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination