WO2015161550A1 - Index management method and device, and computer storage medium - Google Patents


Info

Publication number
WO2015161550A1
Authority
WO
WIPO (PCT)
Prior art keywords
index
data
index data
command
management
Prior art date
Application number
PCT/CN2014/079517
Other languages
French (fr)
Chinese (zh)
Inventor
谢东
喻红宇
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • The present invention relates to database technology, and more particularly to a method, apparatus, and computer storage medium for managing an index.
  • A database is a data processing device developed to meet the needs of data processing.
  • The database system first emerged in 1960.
  • In 1970, the concept of the relational model of databases was proposed.
  • On this basis, the relational database was formed.
  • An index is a data structure that sorts the values of one or more columns in a database table, allowing the corresponding Structured Query Language (SQL) statements to execute faster.
  • Indexes are written by application developers and are commonly used in database development. Index maintenance is performed automatically by the database system and is a very important task.
  • The index maintenance process consumes a large amount of system resources.
  • Embodiments of the present invention provide a method, an apparatus, and a computer storage medium for managing an index, which can reduce system resource consumption.
  • An embodiment of the present invention provides a method for managing an index, including:
  • After a management index command is received, the corresponding index data is obtained from the existing index data; if it cannot be obtained, data is read from the database table and the corresponding index data is obtained by calculation and sorting;
  • Obtaining the corresponding index data from the existing index data includes: determining the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
  • Performing the management operation on the obtained index data according to the management index command includes:
  • The obtained index data is merged, duplicate index data in the merged index data is culled, and the multiple indexes to be merged are merged into one new index.
  • Obtaining the corresponding index data from the existing index data includes: obtaining the index data corresponding to the index to be split from the existing index data;
  • Performing the management operation on the obtained index data according to the management index command includes: dividing the index data corresponding to the index to be split into multiple parts according to the specified splitting method, and splitting the index to be split into multiple indexes accordingly; if the index data ranges of the split indexes overlap, the overlapping index data is copied.
  • The management index command includes any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
  • An embodiment of the present invention further provides an apparatus for managing an index, including:
  • The obtaining module is configured to, after receiving a management index command, obtain the corresponding index data from the existing index data and, if it cannot be obtained, read data from the database table and obtain the corresponding index data by calculation and sorting;
  • The management module is configured to perform a management operation on the obtained index data according to the management index command.
  • The obtaining module is further configured to, after receiving a merge index command, determine the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
  • The management module is further configured to merge the obtained index data, cull duplicate index data from the merged index data, and merge the multiple indexes into one new index.
  • The obtaining module is further configured to, after receiving a split index command, obtain the index data corresponding to the index to be split from the existing index data.
  • The management module is further configured to divide the index data into multiple parts according to the specified splitting method and split the index to be split into multiple indexes accordingly; if the index data ranges of the split indexes overlap, the overlapping index data is copied.
  • The management index command received by the obtaining module includes any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the method for managing the index described above.
  • In summary, the embodiments of the present invention can solve the problems that, in the current index management process of a database, index data is not fully utilized and index maintenance consumes a large amount of system resources.
  • In the index management process of a database, the technique of obtaining index data from existing index data is especially suitable for adding selection conditions to index definition statements and for managing the big data lifecycle.
  • FIG. 1 is a flowchart of a method for managing an index according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for creating a new index according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a method for modifying an index definition according to an embodiment of the present invention
  • FIG. 4 is a flowchart of a method for managing index data when inserting data according to an embodiment of the present invention
  • FIG. 5 is a flowchart of a method for managing index data when updating data according to an embodiment of the present invention
  • FIG. 6 is a flowchart of a method for merging multiple indexes according to an embodiment of the present invention
  • FIG. 7 is a flowchart of a method for splitting an index into multiple indexes according to an embodiment of the present invention
  • FIG. 8 is a schematic diagram of an apparatus for managing an index according to an embodiment of the present invention.
  • The inventors have found that, in the index management process of databases in the related art, index data is obtained by calculation from the table data.
  • The database begins to maintain index data. This action may be triggered by the user sending a create command, or by a user adding, deleting, or modifying data.
  • The database reads the corresponding data content from the table and calculates the index data, which includes pointers to the indexed column values and is sorted in the specified order.
  • The calculated index data is written to the index in the order specified by the index definition. For an update operation, the old index data must be deleted at the same time.
  • The index maintenance process consumes a large amount of system resources.
  • Index maintenance becomes more and more difficult as table data continues to expand and indexes grow larger and larger.
  • During index maintenance, a large amount of system resources is consumed, mainly as follows: in the stage of reading the table data, system input/output (I/O) resources are mainly consumed; in the stage of calculating and sorting the index data, central processing unit (CPU) resources are mainly consumed.
  • The amount of table data may reach the TB, PB, or even ZB scale, and creating a new index takes a long time.
  • Indexes are created based on the actual needs of the application. Typically, there are multiple indexes on the same table. Some indexes may have the same or similar index data, which could be reused by one another during index maintenance.
  • The relational database has a solid mathematical foundation, and index data is very reliable. Unless extreme conditions occur, such as hardware damage caused by a natural disaster, index data is unlikely to be damaged. In the related art, however, the index management process of a database does not make full use of existing index data when index data is needed; instead, the index data is recalculated from the table data each time, so the data of the same field is calculated repeatedly.
  • The embodiment of the present invention is divided into two stages: first, the index data is obtained from the existing index data; if it is not found, data is read from the table (i.e., the database table), and the index data is obtained by calculation and sorting.
  • The syntax for creating an index may be create index idx_t_log_1 on t_log (callno).
  • The syntax with an added selection condition can likewise be written as a create index statement.
  • Because indexes are very flexible, database application developers may create a rich variety of indexes to meet actual needs, so managing index data is especially important.
  • If index data is obtained from existing index data, not only can the results be obtained quickly, but system resource consumption can also be reduced.
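The two stages described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function name get_index_data, the representation of index data as a sorted list of (key, row_id) pairs, and the dictionary of existing indexes keyed by column name are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: an index entry is (key, row_id), and
# index data is a list of entries kept sorted by key.
Entry = Tuple[int, int]


def get_index_data(
    column: str,
    existing: Dict[str, List[Entry]],
    read_table: Callable[[], List[dict]],
) -> Tuple[List[Entry], str]:
    """Return (index data, how it was obtained) for one column."""
    # Stage 1: try to obtain the index data from existing index data.
    reused = existing.get(column)
    if reused is not None:
        return list(reused), "reused"  # no table scan and no sort

    # Stage 2: read the table (I/O), then calculate and sort (CPU).
    rows = read_table()
    entries = sorted((row[column], row_id) for row_id, row in enumerate(rows))
    return entries, "computed"
```

In this sketch the table is only read when no existing index data matches, which is the point at which the related art would always pay the I/O and CPU cost.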
  • A database is a warehouse that organizes, stores, and manages data according to a data structure.
  • There are many types of databases, from the simplest tables for storing various data to large database systems capable of massive data storage.
  • A relational database is a database based on the relational model, which processes the data in the database by means of concepts and methods such as set algebra.
  • IBM researcher Dr. Edgar Frank Codd proposed the concept of the relational model of the database and laid the theoretical foundation for the relational model.
  • The relational database has a solid mathematical foundation and has been widely used in various industries with the development of information technology and the market.
  • Big data not only means "massive data", but also covers complex types of data. Big data includes all data sets, including transactional and interactive data sets, whose size or complexity exceeds the ability of common technologies to capture, manage, and process them at reasonable cost and within a reasonable time. The big data concept is essentially about making effective use of massive data whose size and transfer speed are both very high.
  • Index: in a relational database, an index is a data structure that sorts the values of one or more columns in a database table. The index provides pointers to these column values, sorted in the specified order. An index can make the corresponding SQL statements execute faster.
  • The role of an index is similar to that of a book's table of contents: the required content can be found quickly from the page number listed there.
  • index maintenance tasks include: creating new indexes, updating index data, and deleting indexes.
  • FIG. 1 is a flowchart of a method for managing an index according to an embodiment of the present invention. As shown in FIG. 1, the method in this embodiment includes the following steps:
  • Index maintenance consumes fewer system resources: the embodiment of the present invention obtains index data from existing index data, which consumes less I/O and CPU than the related-art method of calculating index data from the table data, so fewer system resources are consumed and index maintenance takes less time.
  • SQL statement execution is more efficient: in practice, a table often has multiple indexes. When a SQL statement that adds or modifies data is executed on a table, the index data must be added or modified as well. If multiple indexes require the same index data, the related art recalculates it for each index, repeating the sorting of the same index data multiple times. With the method of the embodiment of the present invention, the calculation and sorting are needed only the first time, and the other indexes can then be obtained by copying. The same SQL statement consumes less CPU and I/O, so SQL statement execution is more efficient.
  • Step S110: The database receives a create index command issued by the user. The command specifies the index type, table name, field names, and so on, and may carry a description of the selection condition, which may be a where condition.
  • Step S120: Analyze whether recalculation and sorting are necessary; if so, execute step S130; otherwise, execute step S140.
  • The database analyzes the current status, including which indexes exist on the table, which data range each index covers, and the range of index data to be created, and determines whether existing index data can be used. If so, recalculation and sorting are not required and the process proceeds to step S140; if not, it proceeds to step S130 to recalculate and sort.
  • Step S130: Calculate the index data.
  • Step S140: Find the pointer position in the index and write the index data.
  • Step S150: Determine whether processing is finished. If there is still data to be processed, return to step S120 and process the remaining data until processing is complete.
  • Step S160: End.
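The decision in steps S120 to S140 can be illustrated with a small Python sketch. It assumes, purely for illustration, that the selection condition is a closed key range [lo, hi], that an existing index on the same column records the key range it covers, and that index data is a sorted list of (key, row_id) pairs with non-negative row ids; the function create_index and all of its parameters are hypothetical. The sketch handles a single range in one pass rather than looping over remaining data as step S150 does.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


def create_index(
    lo: int,
    hi: int,
    existing: List[Entry],
    existing_range: Tuple[int, int],
    table: List[int],
) -> Tuple[List[Entry], bool]:
    """Build index data for keys in [lo, hi].

    Returns (entries, recalculated), where recalculated is False when
    the entries were taken from the existing index data.
    """
    cov_lo, cov_hi = existing_range
    if cov_lo <= lo and hi <= cov_hi:
        # S120 -> S140: the existing index covers the requested range,
        # so the already-sorted entries are sliced out without re-sorting.
        start = bisect_left(existing, (lo, -1))
        end = bisect_right(existing, (hi, float("inf")))
        return existing[start:end], False

    # S120 -> S130: not covered, so read the table, calculate and sort.
    entries = sorted(
        (key, row_id) for row_id, key in enumerate(table) if lo <= key <= hi
    )
    return entries, True
```

When the existing index covers the range, the cost is two binary searches and a slice; otherwise the sketch falls back to a full scan and sort of the table column.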
  • FIG. 3 is a flowchart of an implementation of modifying an index definition according to an embodiment of the present invention.
  • the technical solution adopted by the embodiment of the present invention is: determining whether an existing index includes required index data, and if so, preferentially obtaining data required from existing index data.
  • the process of modifying the index definition includes the following steps:
  • Step S210: The database receives an index modification command issued by the user and needs to maintain the index data of that index.
  • Step S220: Analyze whether recalculation and sorting are necessary; if so, execute step S230; otherwise, execute step S240.
  • The database analyzes the current status, including which indexes exist on the table (including the index whose definition is being modified), which data range each index covers, and the range of index data to be created, and then determines whether index data is currently available. If so, recalculation and sorting are not required and step S240 is performed; if not, step S230 is performed to recalculate and sort.
  • Step S230: Calculate the index data.
  • Step S240: Find the position and update the index.
  • Step S250: Determine whether there is still data to be processed; if so, return to step S220 and process the remaining data until processing is complete; otherwise, execute step S260.
  • Step S260: End.
  • Step S310: The database receives an insert data command issued by the user and needs to maintain the index data of the table.
  • Step S320: Analyze whether recalculation and sorting are necessary; if so, execute step S330; otherwise, execute step S340.
  • For each index whose index data needs to be maintained, analyze the current situation to determine whether the required index data has already been calculated and sorted. If so, the index data can be used directly and the process proceeds to step S340; if not, it proceeds to step S330 to calculate and sort.
  • Step S330: Calculate the index data.
  • The currently inserted data is calculated and sorted to obtain the index data.
  • Step S340: Find the pointer position in the index and write the index data.
  • Step S350: Determine whether there is still an index on the table to be maintained; if so, return to step S320; if not, proceed to step S360.
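The loop over steps S320 to S350 can be sketched as below, assuming for illustration that two indexes "require the same index data" when they are defined on the same column. The function insert_row, the layout of the indexes dictionary, and the example index names are hypothetical; the per-row "calculation" is reduced to extracting the key so that the caching pattern stays visible.

```python
from bisect import insort
from typing import Dict, List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


def insert_row(
    row: dict,
    row_id: int,
    indexes: Dict[str, Tuple[str, List[Entry]]],
) -> int:
    """Maintain every index on the table for one inserted row.

    indexes maps an index name to (column, sorted entries).
    Returns how many times index data had to be calculated.
    """
    computed: Dict[str, Entry] = {}
    for _name, (column, entries) in indexes.items():  # S350: next index
        if column not in computed:                    # S320: already calculated?
            computed[column] = (row[column], row_id)  # S330: calculate once
        insort(entries, computed[column])             # S340: find position, write
    return len(computed)
```

For a table with two indexes on callno and one on duration, a single insert triggers two calculations instead of three; the second callno index reuses the entry calculated for the first.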
  • FIG. 5 is a flowchart of an implementation of updating data management index data in an embodiment of the present invention.
  • The technical solution adopted by the embodiment of the present invention is as follows: if multiple indexes require the same index data, sorting is performed only once, and the other indexes do not need to recalculate and sort. As shown in FIG. 5, the process of managing index data when updating data includes the following steps:
  • Step S410: The database receives an update data command issued by the user and needs to maintain the index data corresponding to the table.
  • Step S420: Analyze whether recalculation and sorting are necessary; if so, execute step S430; otherwise, execute step S440.
  • Step S430: Calculate the currently updated data according to the method given by the index definition to obtain new index data, and then go to step S440.
  • Step S440: Find the position and write the index.
  • If the index data is already sorted and consistent, it may be deleted and rewritten, or it may be left as it is without deleting or rewriting.
  • Step S450: Determine whether there is still an index on the table to be maintained; if so, proceed to step S420; if not, execute step S460.
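Step S440 for an update can be sketched as follows, again on the hypothetical sorted-list representation of index data. The function update_entry is an illustrative name; it takes the "leave it as it is" option when the key is unchanged, and otherwise deletes the old index entry and writes the new one at its sorted position.

```python
from bisect import bisect_left, insort
from typing import List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


def update_entry(
    entries: List[Entry], row_id: int, old_key: int, new_key: int
) -> bool:
    """Apply one updated row to one index. Returns True if the index changed."""
    if old_key == new_key:
        # The index data is already sorted and consistent: nothing to rewrite.
        return False

    # Find the position of the old entry and delete it.
    pos = bisect_left(entries, (old_key, row_id))
    if pos < len(entries) and entries[pos] == (old_key, row_id):
        del entries[pos]

    # Write the new entry at its sorted position.
    insort(entries, (new_key, row_id))
    return True
```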
  • FIG. 6 is a flowchart of an implementation of combining multiple indexes according to an embodiment of the present invention.
  • The technical solution adopted by the embodiment of the present invention is as follows: the index data of the new index is obtained by analyzing the computational similarity of the multiple indexes to be merged and the ranges of their corresponding index data; the obtained index data is then merged, duplicate index data is culled, and the multiple indexes are merged into one new index.
  • The process of merging the multiple indexes includes the following steps:
  • Step S510: The database receives a merge index command issued by the user, that is, a command to merge multiple indexes into one index; the index data of these indexes should have the same or similar calculation methods.
  • For example, the indexes include the same fields, and their calculation methods are the same or similar.
  • Step S520: Merge the index data.
  • The merging may be done by modifying the linked-list pointers so that the multiple indexes are connected end to end; then, by analyzing the data range of each index, the duplicate index data is culled.
  • Step S530: Update the system table. Modify the database data dictionary, delete the information of the previous multiple indexes, and insert the new index information.
  • Step S540: End.
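Step S520 can be illustrated with a short sketch. The description suggests splicing linked lists and then culling duplicates by data range; the sketch below swaps in a standard sorted merge over Python lists (heapq.merge) with adjacent-duplicate removal, which reaches the same result for indexes whose entries are sorted by the same calculation method. The function name merge_indexes and the (key, row_id) entry format are assumptions.

```python
from heapq import merge
from typing import List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


def merge_indexes(*indexes: List[Entry]) -> List[Entry]:
    """Merge several sorted indexes into one new index without re-sorting."""
    merged: List[Entry] = []
    for entry in merge(*indexes):  # streaming merge of already-sorted inputs
        # Duplicates are adjacent after a sorted merge, so culling them
        # only needs a comparison with the last entry written.
        if not merged or merged[-1] != entry:
            merged.append(entry)
    return merged
```

Because each input is already sorted, the merge is linear in the total number of entries; updating the data dictionary (step S530) is outside the scope of this sketch.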
  • FIG. 7 is a flowchart of an implementation of splitting an index into multiple indexes according to an embodiment of the present invention.
  • The technical solution adopted by the embodiment of the present invention is: the index data corresponding to the index to be split is obtained from the existing index data, the index data is then divided into multiple parts according to the specified splitting method, and the index to be split is split into multiple indexes accordingly. If the index data ranges of the split indexes overlap, the overlapping index data is copied.
  • As shown in the figure, the process of splitting an index into multiple indexes includes the following steps:
  • Step S610: The database receives a split index command issued by the user, that is, a command to split one index into multiple indexes.
  • The splitting may be performed according to a data range (for example, a time field); each sub-index may keep the original calculation method, and the index data ranges may be mutually exclusive.
  • Step S620: Split the index data.
  • The splitting may be done by modifying the linked-list pointers and breaking the linked list; if the sub-index ranges overlap, the overlapping index data needs to be copied once.
  • Step S630: Update the system table.
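Step S620 can be sketched for range-based splitting as follows. The sketch assumes closed key ranges, non-negative row ids, and the hypothetical sorted-list representation of index data; split_index is an illustrative name. Python list slicing stands in for breaking the linked list, and since a slice is a copy, entries that fall into overlapping ranges end up duplicated in each sub-index, as the description requires.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


def split_index(
    entries: List[Entry], ranges: List[Tuple[int, int]]
) -> List[List[Entry]]:
    """Split one sorted index into sub-indexes, one per closed key range."""
    parts: List[List[Entry]] = []
    for lo, hi in ranges:
        start = bisect_left(entries, (lo, -1))
        end = bisect_right(entries, (hi, float("inf")))
        # The slice keeps the original order, so no sub-index is re-sorted.
        parts.append(entries[start:end])
    return parts
```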
  • FIG. 8 is a schematic diagram of an apparatus for managing an index according to an embodiment of the present invention.
  • the apparatus may run the database as described above.
  • the apparatus for managing an index includes:
  • The obtaining module 81 is configured to, after receiving a management index command, obtain the corresponding index data from the existing index data and, if it cannot be obtained, read data from the database table and obtain the corresponding index data by calculation and sorting;
  • The management module 82 is configured to perform a management operation on the obtained index data according to the management index command.
  • The obtaining module 81 may be further configured to, after receiving a merge index command, obtain the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged;
  • The management module 82 may be further configured to merge the obtained index data, cull the duplicate index data from the merged index data, and merge the multiple indexes into one new index.
  • The obtaining module 81 may be configured to, after receiving a split index command, obtain the index data corresponding to the index to be split from the existing index data;
  • The management module 82 may be configured to divide the index data into multiple parts according to the specified splitting method and split the index to be split into multiple indexes accordingly; if the index data ranges of the split indexes overlap, the overlapping index data is copied.
  • The management index command received by the obtaining module 81 may include any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
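The division of work between the obtaining module 81 and the management module 82 might be organized in software roughly as sketched below. The class names, method names, and handler registry are hypothetical; only the six command kinds and the two responsibilities (obtain index data, then perform the management operation) come from the description.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

Entry = Tuple[int, int]  # (key, row_id), kept sorted by key


class Command(Enum):
    CREATE_INDEX = "create index"
    MODIFY_INDEX_DEFINITION = "modify index definition"
    INSERT_DATA = "insert data"
    UPDATE_DATA = "update data"
    MERGE_INDEX = "merge index"
    SPLIT_INDEX = "split index"


class ObtainingModule:
    """Obtains index data from existing index data, else from the table."""

    def __init__(self, existing: Dict[str, List[Entry]], table: List[dict]) -> None:
        self.existing = existing
        self.table = table

    def obtain(self, column: str) -> List[Entry]:
        if column in self.existing:
            return list(self.existing[column])
        # Fallback: read the table, then calculate and sort.
        return sorted((row[column], row_id) for row_id, row in enumerate(self.table))


class ManagementModule:
    """Performs the management operation that matches the received command."""

    def __init__(self) -> None:
        self.handlers: Dict[Command, Callable[[List[Entry]], object]] = {}

    def register(self, command: Command, handler: Callable[[List[Entry]], object]) -> None:
        self.handlers[command] = handler

    def manage(self, command: Command, index_data: List[Entry]) -> object:
        return self.handlers[command](index_data)
```

Keeping the fallback inside the obtaining module means every management operation benefits from reuse of existing index data without having to know where the data came from.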
  • The obtaining module 81 and the management module 82 may be implemented by a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA) of the apparatus for managing an index.
  • The embodiment of the invention further describes a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are configured to perform the method for managing an index described above.
  • embodiments of the invention may be provided as a method, system, or computer program product. Accordingly, the present invention can take the form of a hardware embodiment, a software embodiment, or a combination of software and hardware aspects. Moreover, the invention can take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage and optical storage, etc.) in which computer usable program code is embodied.
  • The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Abstract

Provided are an index management method and device, and a computer storage medium. The method comprises: after an index management command is received, acquiring the corresponding index data from the existing index data and, if the index data cannot be acquired, reading data from a database table and acquiring the corresponding index data by calculation and sorting; and performing a management operation on the acquired index data according to the index management command.

Description

Method, apparatus, and computer storage medium for managing an index
Technical field
The present invention relates to database technology, and more particularly to a method, apparatus, and computer storage medium for managing an index.
Background
A database is a data processing device developed to meet the needs of data processing. The database system first emerged in 1960; in 1970 the concept of the relational model of databases was proposed, and the relational database was formed on this basis. With the development of information technology, data has penetrated every industry and application, and relational databases have been widely used across industries. In a relational database, an index is a data structure that sorts the values of one or more columns in a database table, allowing the corresponding Structured Query Language (SQL) statements to execute faster. Indexes are written by application developers and are commonly used in database development. Index maintenance is performed automatically by the database system and is a very important task. However, the world has changed dramatically, and the characteristics of data have changed greatly compared with the era when the database concept was first proposed. Data with a complex structure and a large volume is collectively referred to as big data. Faced with such data, index maintenance becomes increasingly difficult and has become an important problem to be solved urgently.
The index management methods of the related art have at least the following disadvantages:
A. The index maintenance process consumes a large amount of system resources;
B. In the process of maintaining an index, existing index data is not fully utilized.
Summary of the invention
Embodiments of the present invention provide a method, an apparatus, and a computer storage medium for managing an index, which can reduce system resource consumption.
The technical solution of the embodiments of the present invention is implemented as follows: an embodiment of the present invention provides a method for managing an index, including:
after a management index command is received, obtaining the corresponding index data from the existing index data and, if it cannot be obtained, reading data from the database table and obtaining the corresponding index data by calculation and sorting;
performing a management operation on the obtained index data according to the management index command.
Preferably, obtaining the corresponding index data from the existing index data includes: determining the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
Preferably, performing the management operation on the obtained index data according to the management index command includes:
merging the obtained index data, culling duplicate index data from the merged index data, and merging the multiple indexes to be merged into one new index.
Preferably, obtaining the corresponding index data from the existing index data includes: obtaining the index data corresponding to the index to be split from the existing index data;
Preferably, performing the management operation on the obtained index data according to the management index command includes:
dividing the index data corresponding to the index to be split into multiple parts according to the specified splitting method, and splitting the index to be split into multiple indexes accordingly; if the index data ranges of the split indexes overlap, copying the overlapping index data.
Preferably, the management index command includes any one of the following:
a create index command; a modify index definition command; an insert data command; an update data command; a merge index command; a split index command.
An embodiment of the present invention further provides an apparatus for managing an index, including:
an obtaining module, configured to, after receiving a management index command, obtain the corresponding index data from the existing index data and, if it cannot be obtained, read data from the database table and obtain the corresponding index data by calculation and sorting;
a management module, configured to perform a management operation on the obtained index data according to the management index command.
Preferably, the obtaining module is further configured to, after receiving a merge index command, determine the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
Preferably, the management module is further configured to merge the obtained index data, cull duplicate index data from the merged index data, and merge the multiple indexes into one new index.
Preferably, the obtaining module is further configured to, after receiving a split index command, obtain the index data corresponding to the index to be split from the existing index data.
Preferably, the management module is further configured to divide the index data into multiple parts according to the specified splitting method and split the index to be split into multiple indexes accordingly; if the index data ranges of the split indexes overlap, the overlapping index data is copied.
Preferably, the management index command received by the obtaining module includes any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
An embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the method for managing an index described above.
In summary, the embodiments of the present invention address two problems of current database index management: existing index data is not fully utilized, and index maintenance consumes substantial system resources. Obtaining index data from existing index data during index management is particularly suitable for index definition statements with added selection conditions and for big data lifecycle management.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for managing an index according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for creating an index according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for modifying an index definition according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for managing index data when data is inserted according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for managing index data when data is updated according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method for merging multiple indexes according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for splitting an index into multiple indexes according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an apparatus for managing an index according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
In the course of implementing the present invention, the inventors found that, in the database index management process of the related art, index data is obtained by computing over table data. Index data is maintained as follows:
First, the database starts maintaining index data. This may be triggered by a create index command sent by the user, or by the user's insert, delete, or modify operations.
Next, according to the index definition, the database reads the corresponding data from the table and computes the index data, including pointers to the indexed column values, sorted in the specified order.
Finally, the computed index data is written to the index in the order specified by the index definition. For an update operation, the old index data must also be deleted.
The index management method of the related art has at least the following drawbacks:
A. Index maintenance consumes substantial system resources.
In the big data era, table data keeps growing and indexes become ever larger, so index maintenance becomes increasingly difficult. Index maintenance consumes a large amount of system resources: reading table data mainly consumes system input/output (I/O) resources, while computing and sorting index data mainly consumes central processing unit (CPU) resources. In some large systems, the volume of table data may reach the TB, PB, or ZB scale, and creating a new index takes a long time.
B. Existing index data is not fully utilized during index maintenance.
Indexes are created according to the actual needs of the application, and a single table usually has several indexes. Some indexes may hold identical or similar index data, which could be shared among them during index maintenance. Relational databases rest on a solid mathematical foundation, and index data is highly reliable: short of extreme events such as hardware damage from a natural disaster, index data is rarely corrupted. In the related art, however, the database does not make full use of existing index data when it needs index data. Instead, it recomputes the index data from table data every time, so some data of the same field is computed repeatedly.
In the embodiments of the present invention, index data is obtained in two stages during database index management: the index data is first sought in existing index data; if it is not found there, data is read from the table (that is, the database table), and the index data is obtained by computing and sorting. Consider the technique of adding a selection condition to an index definition statement. For example, an ordinary create index statement might be create index idx_t_log_1 on t_log (callno), whereas in this embodiment the statement with an added selection condition can be written as create index idx_t_log_1 on t_log (callno) where calltime>'20140101000000'. With this technique, and in big data lifecycle management, indexes become very flexible, and database application developers may create a rich variety of indexes to meet actual needs, which makes managing index data particularly important. In view of the drawbacks of the related art, in-depth study shows that, during database index management, obtaining index data from existing index data not only yields results quickly but also reduces system resource consumption.
Several terms used in the embodiments of the present invention are explained as follows:
Database: a repository that organizes, stores, and manages data according to a data structure. There are many types of databases, ranging from the simplest tables storing various data to large database systems capable of mass data storage.
Relational database: a database built on the relational model, which processes the data in the database by means of concepts and methods such as set algebra. In 1970, IBM researcher Dr. Edgar Frank Codd proposed the relational model of databases, laying its theoretical foundation. Relational databases rest on a solid mathematical foundation and, with the development of information technology and the market, have come into wide use across industries.
Big data: includes not only "massive data" but also data of complex types. Big data covers all data sets, including transaction and interaction data sets, whose size or complexity exceeds the ability of commonly used technologies to capture, manage, and process them at reasonable cost and within a reasonable time. The concept of big data is in essence the effective use of massive data, and it places high demands on data scale and transmission speed.
Index: in a relational database, an index is a data structure that sorts the values of one or more columns of a database table. An index provides pointers to these column values, sorted in a specified order. An index makes the corresponding SQL statements execute faster; it works like the table of contents of a book, which lets the reader quickly find the desired content by page number. Once an index is defined, its maintenance is performed automatically by the database system. Common index maintenance tasks include creating an index, updating index data, and deleting an index.
FIG. 1 is a flowchart of a method for managing an index according to an embodiment of the present invention. As shown in FIG. 1, the method of this embodiment includes the following steps:
S11: Upon receiving a management index command, obtain the corresponding index data from existing index data; if it cannot be obtained there, read data from the database table and obtain the index data by computing and sorting.
S12: Perform a management operation on the obtained index data according to the management index command.
The embodiments of the present invention have the following technical effects:
Index maintenance consumes few system resources: the embodiments obtain index data from existing index data, which consumes less I/O and CPU than computing index data from table data as in the related art, so system resource consumption is lower and index maintenance time is shorter.
SQL statements execute efficiently: in practice, a table often has several indexes. When an SQL statement that inserts or modifies data is executed on a table, index data must be added or modified. When several indexes need the same index data, the related art creates an index if the required one does not exist and computes and sorts the same index data repeatedly. With the method of the embodiments, the data is computed and sorted only once, after which the other indexes can obtain it by copying. The same SQL statement consumes less CPU and I/O and executes more efficiently.
FIG. 2 is a flowchart of creating an index according to an embodiment of the present invention. The technical solution of this embodiment is to determine whether an existing index contains the required index data and to obtain the required data preferentially from existing index data. As shown in FIG. 2, creating an index includes the following steps:
Step S110: The database receives a create index command from the user.
The create index command may include information such as the index type, index name, table name, and field names. Where the technique of adding a selection condition to the index definition statement is used, it also includes a description of the selection condition, which may be a where condition.
Step S120: Analyze whether recomputation and sorting are needed. If so, perform step S130; otherwise, perform step S140.
The database analyzes the current state, including which indexes exist on the table, which data range each index covers, and which index data range needs to be created, and determines whether any index data is currently available for use. If so, no recomputation and sorting is needed, and the flow proceeds to step S140; if not, the flow proceeds to step S130 to compute and sort.
Step S130: Compute the index data.
Read the table data, compute and sort, and obtain the index data.
Step S140: Find the pointer position in the index and write the index data.
Step S150: Determine whether processing is finished. If there is still data to process, return to step S120 and process the remaining data until processing is complete.
Step S160: End.
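The decision in steps S120 to S140 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Index` class, the `covers` and `create_index` helpers, and the modelling of the selection condition as a single lower bound on `calltime` are all hypothetical, and the per-portion loop of step S150 is collapsed into one pass.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]
Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


@dataclass
class Index:
    name: str
    column: str
    time_from: Optional[str]  # covers rows with calltime > time_from; None covers all rows
    entries: List[Entry] = field(default_factory=list)  # kept sorted by indexed value


def covers(existing: Index, column: str, time_from: Optional[str]) -> bool:
    """S120: can `existing` supply every entry the new index needs?"""
    if existing.column != column:
        return False
    if existing.time_from is None:
        return True
    return time_from is not None and existing.time_from <= time_from


def create_index(table: List[Row], existing: List[Index], name: str,
                 column: str, time_from: Optional[str] = None) -> Tuple[Index, str]:
    """Return the new index and whether its data was 'reused' or 'computed'."""
    def wanted(rid: int) -> bool:
        return time_from is None or table[rid]["calltime"] > time_from

    for candidate in existing:
        if covers(candidate, column, time_from):
            # S140 without S130: the candidate's entries are already sorted, so
            # filtering them keeps the order and no sort is needed. In this
            # simplified sketch the filter still looks up calltime per entry.
            entries = [e for e in candidate.entries if wanted(e[1])]
            return Index(name, column, time_from, entries), "reused"

    # S130: nothing reusable, so read the table, compute, and sort.
    entries = sorted((row[column], rid) for rid, row in enumerate(table) if wanted(rid))
    return Index(name, column, time_from, entries), "computed"
```

Creating a full index on `callno` first and then a second index on `callno` restricted by `calltime` takes the reuse path for the second call, because the full index already holds a sorted superset of the required entries.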
FIG. 3 is a flowchart of modifying an index definition according to an embodiment of the present invention. The technical solution of this embodiment is to determine whether an existing index contains the required index data and, if so, to obtain the required data preferentially from existing index data. As shown in FIG. 3, modifying an index definition includes the following steps:
Step S210: The database receives a command from the user to modify the definition of an index, and the index data of that index must be maintained.
Modifying an index definition means, for example, changing the index data range that the index needs to cover.
Step S220: Analyze whether recomputation and sorting are needed. If so, perform step S230; otherwise, perform step S240.
The database analyzes the current state, including which indexes exist on the table (including the index whose definition is being modified), which data range each index covers, and which index data range needs to be created, and then determines whether any index data is currently available for use. If so, no recomputation and sorting is needed, and the flow proceeds to step S240; if not, the flow proceeds to step S230 to compute and sort.
Step S230: Compute the index data.
Read the table data, compute and sort, and obtain the index data.
Step S240: Find the position and update the index.
Delete all index data of the modified index, find the new pointer position in the index, and write the new index data.
Step S250: Determine whether there is still data to process. If so, return to step S220 and process the remaining data until processing is complete; otherwise, perform step S260.
Step S260: End.
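One case worth spelling out is that the index being modified is itself a candidate source in step S220. The sketch below assumes, purely for illustration, that the index definition is a lower bound on `calltime`; the function name `rebuild_after_redefinition` and its parameters are hypothetical. Narrowing the range reuses the index's own old entries, while widening it falls back to the table.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]
Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


def rebuild_after_redefinition(table: List[Row], old_entries: List[Entry],
                               old_from: Optional[str], new_from: Optional[str],
                               column: str) -> Tuple[List[Entry], str]:
    """Return the entries for the new definition and how they were obtained."""
    def wanted(rid: int) -> bool:
        return new_from is None or table[rid]["calltime"] > new_from

    # S220: the old data is usable when it is a superset of the new range.
    old_is_superset = old_from is None or (new_from is not None and old_from <= new_from)
    if old_is_superset:
        # S240: drop the old entries that fall outside the new range; order is kept.
        return [e for e in old_entries if wanted(e[1])], "reused"

    # S230: the new range is wider, so read the table, compute, and sort.
    entries = sorted((row[column], rid) for rid, row in enumerate(table) if wanted(rid))
    return entries, "computed"
```

Returning a fresh list mirrors step S240, where all index data of the modified index is deleted before the new index data is written.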
FIG. 4 is a flowchart of managing index data when data is inserted according to an embodiment of the present invention. The technical solution of this embodiment is as follows: if several indexes need the same index data (some index data of several indexes may be identical), the data is computed and sorted only once, and the other indexes need not recompute and sort it. As shown in FIG. 4, managing index data on insertion includes the following steps:
Step S310: The database receives an insert data command from the user, and the index data of the table must be maintained.
Step S320: Analyze whether recomputation and sorting are needed. If so, perform step S330; otherwise, perform step S340.
For a given index whose index data needs maintenance, analyze the current state to determine whether the required index data has already been computed and sorted. If so, the index data can be used directly, and the flow proceeds to step S340; if not, the flow proceeds to step S330 to compute and sort.
Step S330: Compute the index data.
According to the index definition, compute and sort the data being inserted to obtain the index data.
Step S340: Find the pointer position in the index and write the index data.
Step S350: Determine whether any other index on the table still needs maintenance. If so, return to step S320; if not, go to step S360.
Step S360: End.
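The loop over indexes in steps S320 to S350 might look like the following sketch. It assumes, for illustration only, that each index is a sorted Python list keyed by one column and that "the same index data" means the same column; `insert_row` and the `indexes` layout are hypothetical names.

```python
import bisect
from typing import Dict, List, Tuple

Row = Dict[str, str]
Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


def insert_row(table: List[Row],
               indexes: Dict[str, Tuple[str, List[Entry]]],
               row: Row) -> int:
    """Insert `row` and maintain every index; return how many entries were computed."""
    table.append(row)
    rid = len(table) - 1
    computed: Dict[str, Entry] = {}  # column -> entry already computed for this insert

    for _name, (column, entries) in indexes.items():  # S350 loops over the indexes
        if column not in computed:                    # S320: already computed?
            computed[column] = (row[column], rid)     # S330: compute once
        bisect.insort(entries, computed[column])      # S340: find position and write
    return len(computed)
```

With two indexes on `callno` and one on `calltime`, a single insert computes two entries rather than three, which is the saving the embodiment describes.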
FIG. 5 is a flowchart of managing index data when data is updated according to an embodiment of the present invention. The technical solution of this embodiment is as follows: if several indexes need the same index data, the data is computed and sorted only once, and the other indexes need not recompute and sort it. As shown in FIG. 5, managing index data on update includes the following steps:
Step S410: The database receives an update data command from the user, and the index data corresponding to the table must be maintained.
Step S420: Analyze whether recomputation and sorting are needed. If so, perform step S430; otherwise, perform step S440.
For a given index whose index data needs maintenance, analyze the current state to determine whether the required index data has already been computed and sorted. If so, the index data can be used directly, and the flow proceeds to step S440; if not, the flow proceeds to step S430 to compute and sort.
Step S430: According to the index definition, compute and sort the data being updated to obtain the new index data, then go to step S440.
Step S440: Find the position and write the index.
Find the pointer position in the index, delete the index data at that position (that is, the old index data), and write the new index data.
If the index data has already been computed and sorted and is unchanged, it may be deleted and rewritten, or the delete and rewrite may be skipped.
Step S450: Determine whether any other index on the table still needs maintenance. If so, return to step S420; if not, perform step S460.
Step S460: End.
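The update flow differs from insertion mainly in step S440, where the old entry is removed before the new one is written. The following sketch makes the same illustrative assumptions as a sorted-list index keyed by one column; `update_row` and its parameters are hypothetical. It takes the option of skipping the delete and rewrite when the entry is unchanged.

```python
import bisect
from typing import Dict, List, Tuple

Row = Dict[str, str]
Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


def update_row(table: List[Row],
               indexes: Dict[str, Tuple[str, List[Entry]]],
               rid: int, changes: Row) -> int:
    """Apply `changes` to row `rid`, maintain every index, return entries computed."""
    old_row = table[rid]
    new_row = {**old_row, **changes}
    table[rid] = new_row
    computed: Dict[str, Entry] = {}  # column -> new entry computed for this update

    for _name, (column, entries) in indexes.items():   # S450 loops over the indexes
        if column not in computed:                     # S420: already computed?
            computed[column] = (new_row[column], rid)  # S430: compute once
        old_entry, new_entry = (old_row[column], rid), computed[column]
        if new_entry == old_entry:
            continue  # unchanged entry: skip the delete and rewrite
        pos = bisect.bisect_left(entries, old_entry)   # S440: find the position
        if pos < len(entries) and entries[pos] == old_entry:
            del entries[pos]                           # delete the old index data
        bisect.insort(entries, new_entry)              # write the new index data
    return len(computed)
```

Here every index on an unchanged column is left untouched, and indexes sharing a changed column reuse the single entry computed for it.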
FIG. 6 is a flowchart of merging multiple indexes according to an embodiment of the present invention. The technical solution of this embodiment is to obtain the index data of the new index by analyzing the computational similarity of the indexes to be merged and the ranges of their index data, then to merge the obtained index data, remove duplicate index data, and merge the multiple indexes into one new index. As shown in FIG. 6, merging multiple indexes includes the following steps:
Step S510: The database receives a merge index command from the user, that is, a command to merge multiple indexes into one index. The data of these indexes should be computed by the same or a similar method.
For example, the indexes contain the same fields and use the same or similar computation methods.
Step S520: Merge the index data.
Because indexes are ordered, multiple indexes can be merged into one index in the specified order.
Where the indexes are stored as linked lists, merging may be done by modifying the list pointers to connect the indexes head to tail; duplicate index data is then removed by analyzing the data range of each index.
Step S530: Update the system tables.
Modify the database data dictionary: delete the information of the former indexes and insert the information of the new index.
Step S540: End.
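Steps S520 and S530 can be illustrated with sorted Python lists. Note the swapped-in technique: rather than the linked-list pointer splicing mentioned above, this sketch uses the standard library's `heapq.merge` to interleave already-sorted inputs, and a plain dictionary stands in for the data dictionary. The names `merge_indexes` and `merge_in_catalog` are hypothetical.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


def merge_indexes(*indexes: List[Entry]) -> List[Entry]:
    """S520: merge sorted indexes in order and drop duplicate entries."""
    merged: List[Entry] = []
    for entry in heapq.merge(*indexes):  # inputs are already sorted, so no re-sort
        if not merged or merged[-1] != entry:
            merged.append(entry)         # duplicates are adjacent after the merge
    return merged


def merge_in_catalog(catalog: Dict[str, List[Entry]],
                     old_names: Iterable[str], new_name: str) -> None:
    """S530: replace the old index records with the single new index."""
    old_names = list(old_names)
    merged = merge_indexes(*(catalog[name] for name in old_names))
    for name in old_names:
        del catalog[name]
    catalog[new_name] = merged
```

Because the merge only walks each input once, no table data is read and no sort is performed, which matches the intent of reusing existing index data.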
FIG. 7 is a flowchart of splitting an index into multiple indexes according to an embodiment of the present invention. The technical solution of this embodiment is to obtain the index data corresponding to the index to be split from existing index data, divide that index data into multiple parts according to a specified splitting method, and correspondingly split the index into multiple indexes; if the index data ranges of the resulting indexes overlap, the overlapping index data is copied. As shown in FIG. 7, splitting an index into multiple indexes includes the following steps:
Step S610: The database receives a split index command from the user, that is, a command to split one index into multiple indexes.
The split may be made by data range (for example, on a time field); the sub-indexes may keep the original computation method, and their index data ranges need not overlap.
Step S620: Split the index data.
Because the index is ordered, it only needs to be traversed once in the specified order. Where the index is stored as a linked list, splitting may be done by modifying the list pointers to break the list. If the ranges of the sub-indexes overlap, the overlapping index data must be copied once.
Step S630: Update the system tables.
Modify the database data dictionary: delete the information of the former index and insert the information of the resulting indexes.
Step S640: End.
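Step S620 can be sketched with sorted lists and binary search. The assumptions here are illustrative: the split ranges are half-open intervals over the indexed value itself, the index is a Python list rather than a linked list, and `split_index` is a hypothetical name. Slicing copies the entries, so entries inside an overlap end up in both sub-indexes.

```python
import bisect
from typing import Dict, List, Tuple

Entry = Tuple[str, int]  # (indexed value, row id standing in for the row pointer)


def split_index(entries: List[Entry],
                ranges: Dict[str, Tuple[str, str]]) -> Dict[str, List[Entry]]:
    """Split a sorted index into sub-indexes, one per half-open range [lo, hi)."""
    keys = [key for key, _rid in entries]  # one pass over the ordered index
    result: Dict[str, List[Entry]] = {}
    for name, (lo, hi) in ranges.items():
        start = bisect.bisect_left(keys, lo)
        stop = bisect.bisect_left(keys, hi)
        # The slice is a new list, so overlapping ranges each get their own copy.
        result[name] = entries[start:stop]
    return result
```

Each sub-index inherits the order of the original, so no sub-index needs to be re-sorted after the split.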
FIG. 8 is a schematic diagram of an apparatus for managing an index according to an embodiment of the present invention. The apparatus can run the database described above. As shown in FIG. 8, the apparatus for managing an index includes:
an acquisition module 81, configured to, upon receiving a management index command, obtain the corresponding index data from existing index data and, if it cannot be obtained there, read data from the database table and obtain the corresponding index data by computing and sorting;
a management module 82, configured to perform a management operation on the obtained index data according to the management index command.
In one implementation, the acquisition module 81 may be further configured to, upon receiving a merge index command, obtain the index data of the new index by analyzing the computational similarity of the multiple indexes to be merged and the ranges of the index data corresponding to those indexes;
the management module 82 may be further configured to merge the obtained index data, remove duplicate index data from the merged index data, and merge the multiple indexes into one new index.
In one implementation, the acquisition module 81 may be further configured to, upon receiving a split index command, obtain the index data corresponding to one index to be split from existing index data;
the management module 82 may be further configured to divide the index data into multiple parts according to a specified splitting method and correspondingly split the index to be split into multiple indexes; if the index data ranges of the resulting indexes overlap, the overlapping index data is copied.
The management index command received by the acquisition module 81 may include any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
In practice, the acquisition module 81 and the management module 82 may be implemented by a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) of the apparatus for managing an index.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the method for managing an index shown in FIG. 1.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also fall within the scope of protection of the present invention.

Claims

1. A method for managing an index, comprising:
after receiving an index management command, obtaining the corresponding index data from existing index data and, if the corresponding index data is not obtained, reading data from a database table and obtaining the corresponding index data by computation and sorting; and
performing a management operation on the obtained index data according to the index management command.
2. The method according to claim 1, wherein obtaining the corresponding index data from the existing index data comprises:
determining the index data of a new index by analyzing the computational similarity of multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
3. The method according to claim 2, wherein performing the management operation on the obtained index data according to the index management command comprises:
merging the obtained index data, removing duplicate index data from the merged index data, and merging the multiple indexes to be merged into one new index.
4. The method according to claim 1, wherein obtaining the corresponding index data from the existing index data comprises:
obtaining, from the existing index data, the index data corresponding to an index to be split.
5. The method according to claim 4, wherein performing the management operation on the obtained index data according to the index management command comprises:
dividing the index data corresponding to the index to be split into multiple parts according to a specified splitting method, and correspondingly splitting the index to be split into multiple indexes, wherein, if the index data ranges of the split indexes overlap, the overlapping index data is copied.
6. The method according to any one of claims 1 to 5, wherein the index management command comprises any one of the following:
a create index command; a modify index definition command; an insert data command; an update data command; a merge index command; a split index command.
7. A device for managing an index, comprising:
an acquisition module configured to, after receiving an index management command, obtain the corresponding index data from existing index data and, if the corresponding index data is not obtained, read data from a database table and obtain the corresponding index data by computation and sorting; and
a management module configured to perform a management operation on the obtained index data according to the index management command.
8. The device according to claim 7, wherein
the acquisition module is further configured to, after receiving a merge index command, determine the index data of a new index by analyzing the computational similarity of multiple indexes to be merged and the ranges of the index data corresponding to the multiple indexes to be merged.
9. The device according to claim 8, wherein
the management module is further configured to merge the obtained index data, remove duplicate index data from the merged index data, and merge the multiple indexes into one new index.
10. The device according to claim 6, wherein
the acquisition module is further configured to, after receiving a split index command, obtain from the existing index data the index data corresponding to the index to be split.
11. The device according to claim 10, wherein
the management module is further configured to divide the index data into multiple parts according to a specified splitting method, and correspondingly split the index to be split into multiple indexes, wherein, if the index data ranges of the split indexes overlap, the overlapping index data is copied.
12. The device according to any one of claims 7 to 11, wherein
the index management command received by the acquisition module comprises any one of the following: a create index command, a modify index definition command, an insert data command, an update data command, a merge index command, and a split index command.
13. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the method for managing an index according to any one of claims 1 to 7.
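The flow recited in claims 1, 3, and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the in-memory model (a table as a list of row dictionaries, index data as a sorted list of `(key, row_id)` pairs) and the names `get_index_data`, `merge_indexes`, and `split_index` are assumptions made only for illustration. The similarity analysis of claim 2 is not modeled, and the "specified splitting method" is assumed here to be a list of inclusive key ranges.

```python
from typing import Dict, List, Tuple

# Hypothetical model: index data is a sorted list of (key, row_id) pairs.
Row = Dict[str, int]
IndexData = List[Tuple[int, int]]


def get_index_data(name: str, column: str,
                   existing: Dict[str, IndexData],
                   table: List[Row]) -> IndexData:
    """Reuse existing index data; otherwise build it from the table."""
    if name in existing:
        return existing[name]
    # Fallback of claim 1: read the table, compute the keys, and sort.
    data = sorted((row[column], row_id) for row_id, row in enumerate(table))
    existing[name] = data
    return data


def merge_indexes(parts: List[IndexData]) -> IndexData:
    """Merge the data of several indexes and drop duplicate entries."""
    return sorted({entry for part in parts for entry in part})


def split_index(data: IndexData,
                ranges: List[Tuple[int, int]]) -> List[IndexData]:
    """Split index data by inclusive key ranges.

    An entry whose key falls in more than one range is copied into
    each resulting index, as in the overlapping-range case of claim 5.
    """
    return [[e for e in data if lo <= e[0] <= hi] for lo, hi in ranges]


# Usage: build an index, split it into overlapping ranges, merge it back.
table = [{"age": 30}, {"age": 10}, {"age": 20}, {"age": 40}]
store: Dict[str, IndexData] = {}
idx = get_index_data("idx_age", "age", store, table)
low, high = split_index(idx, [(10, 30), (30, 40)])
merged = merge_indexes([low, high])
```

In this sketch the entry with key 30 lies in both ranges, so it is copied into both split indexes, and merging the two parts removes that duplicate again. A real database would operate on persistent index structures such as B-trees rather than Python lists.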
PCT/CN2014/079517 2014-04-24 2014-06-09 Index management method and device, and computer storage medium WO2015161550A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410168535.XA CN105022743A (en) 2014-04-24 2014-04-24 Index management method and index management device
CN201410168535.X 2014-04-24

Publications (1)

Publication Number Publication Date
WO2015161550A1 (en) 2015-10-29

Family

ID=54331665

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/079517 WO2015161550A1 (en) 2014-04-24 2014-06-09 Index management method and device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN105022743A (en)
WO (1) WO2015161550A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460052B (en) * 2017-02-22 2022-11-01 中兴通讯股份有限公司 Method and device for automatically creating index and database system
CN107247639A (en) * 2017-05-03 2017-10-13 上海动联信息技术股份有限公司 A kind of efficient backup method of mysql databases

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6272486B1 (en) * 1998-04-16 2001-08-07 International Business Machines Corporation Determining the optimal number of tasks for building a database index
CN102023991A (en) * 2009-09-21 2011-04-20 中兴通讯股份有限公司 Method and device for updating indexes on terminal and sorting search results on the basis of updated indexes
CN102207935A (en) * 2010-03-30 2011-10-05 国际商业机器公司 Method and system for establishing index
US8055645B1 (en) * 2006-12-15 2011-11-08 Packeteer, Inc. Hierarchical index for enhanced storage of file changes
CN102332029A (en) * 2011-10-15 2012-01-25 西安交通大学 Hadoop-based mass classifiable small file association storage method
CN103810175A (en) * 2012-11-06 2014-05-21 凌群电脑股份有限公司 Method for automatically establishing data indexes

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102117309B (en) * 2010-01-06 2013-04-17 卓望数码技术(深圳)有限公司 Data caching system and data query method
CN102033954B (en) * 2010-12-24 2012-10-17 东北大学 Full text retrieval inquiry index method for extensible markup language document in relational database
CN102314506B (en) * 2011-09-07 2015-09-09 北京人大金仓信息技术股份有限公司 Based on the distributed buffering district management method of dynamic index
CN103544156B (en) * 2012-07-10 2019-04-09 腾讯科技(深圳)有限公司 File memory method and device
CN103544261B (en) * 2013-10-16 2016-06-22 国家计算机网络与信息安全管理中心 A kind of magnanimity structuring daily record data global index's management method and device

Also Published As

Publication number Publication date
CN105022743A (en) 2015-11-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14889789

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14889789

Country of ref document: EP

Kind code of ref document: A1