WO2015168988A1 - Data index creation method and device, and computer storage medium - Google Patents

Data index creation method and device, and computer storage medium

Info

Publication number
WO2015168988A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
index
table data
filtering condition
target table
Prior art date
Application number
PCT/CN2014/082640
Other languages
French (fr)
Chinese (zh)
Inventor
谢东
喻红宇
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2015168988A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • The present invention relates to the field of database technologies, and in particular to a data index creation method, device, and computer storage medium.
  • a database is a data processing device that has been developed to meet the needs of data processing.
  • Database systems first appeared in 1960, and in 1970 the concept of the relational model was proposed, on the basis of which the relational database was formed.
  • the relational database is a database based on the relational database model and has a solid mathematical theoretical basis.
  • the database has penetrated into various industries and applications, and the relational database has been widely used in various industries.
  • the world has undergone earth-shaking changes, and the data characteristics have changed greatly compared to the era when the database concept was just presented.
  • Those complex and large amounts of data are collectively referred to as big data.
  • traditional relational database systems have shown more and more drawbacks.
  • SQL Structured Query Language
  • IBM US National Bureau of Standards adopted the American Standard for Database Language.
  • ISO International Organization for Standardization
  • ISO introduced the SQL89 standard with integrity characteristics.
  • ISO published the SQL92 standard.
  • The index is a structure that sorts the values in a data table of the database.
  • the index can be used to quickly access specific information in the database table.
  • The conventional method for creating a database index is to analyze the application's usage scenarios and the query conditions of its SQL statements, derive the indexes that may be needed together with their candidate fields, and finally, weighing factors such as performance, choose one or more columns of the data table as index fields and create one or more indexes. Because each index covers all data in the data table, the index field value of every piece of data corresponds to one piece of index data in the index. Whenever the index field value of any piece of data is added, deleted, or changed, the corresponding index data must be added or deleted as well.
  • the index is used during the execution of the SQL statement to enable the user to quickly access the data, but because some of the data in the index has not been used by SQL for a long time, the utilization of the index data is reduced.
  • the embodiment of the present invention provides a data index creation method, which includes the following steps:
  • the target table data is selected to obtain table data satisfying the data filtering condition, and an index is created for the target table data satisfying the data filtering condition.
  • the target table data is selected according to the data filtering condition to obtain table data that meets the data filtering condition, and an index is created for the table data that meets the data filtering condition, including: Read the target table data in the database;
  • the target table data that satisfies the data filtering condition is calculated to obtain index data
  • the index data is written to the location of the corresponding pointer on the index.
  • the method further includes:
  • the obtaining the data filtering condition is: obtaining a data filtering condition by using an SQL command for creating an index.
  • the data filtering condition includes: the selection condition on which the target group used to create the index is selected from the groups into which the table data is divided according to the data life cycle.
  • the data filtering condition is implemented by a where statement, or by a statement containing a range word, or by a statement containing a list word.
  • the embodiment of the invention further provides a data index creation device, including:
  • a data filtering condition obtaining module configured to obtain a data filtering condition
  • the index creation module is configured to select, according to the data filtering condition acquired by the data filtering condition acquisition module, the table data that meets the filtering condition, and create an index on the table data that satisfies the data filtering condition.
  • the index creation module includes:
  • the data filtering condition determining unit is configured to determine whether the target table data read by the table data reading unit satisfies the data filtering condition, and trigger a computing unit when the target table data satisfies the data filtering condition;
  • a calculating unit configured to calculate target data that meets the data filtering condition to obtain index data
  • the index data writing unit is configured to write the index data obtained by the calculating unit to a position of a corresponding pointer on the index.
  • the index creation module further includes a data discriminating unit configured to determine whether unread table data exists in the database after the index data writing unit writes the index data to the position of the corresponding pointer on the index, and, if the index has not been fully created, to trigger the table data reading unit to continue reading the target table data in the database.
  • the data filtering condition obtaining module is configured to acquire a data filtering condition by using an SQL command for creating an index.
  • the data is big data
  • the data filtering condition includes: the selection condition on which the target group used to create the index is selected from the groups into which the table data is divided according to the data life cycle.
  • the data filtering condition is implemented by a where statement, or by a statement containing a range word, or by a statement containing a list word.
  • the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the data index creation method according to the embodiment of the present invention.
  • When an index is created, the data index creation method filters the target table data according to a given data filtering condition and filters out the target table data that does not satisfy it. An index is created for only part of the data in the database, so the amount of data in the index is greatly reduced. At the same time, target table data that is basically never used is kept out of the index, which significantly improves the use efficiency of the index. In addition, because redundant table data is filtered out, users access the database through the index faster, and the maintenance cost of the index is also reduced.
  • the operation of adding and deleting data becomes simple; the speed of creating the index is improved, the difficulty is reduced, and the time is reduced, making it easier for the user to create a new index according to his needs.
  • the method provided by the embodiments of the present invention can form an index more quickly.
  • The data index creation method of the embodiment of the present invention allows database developers to manage the database according to actual needs and the data life cycle: the developer first determines the filtering condition of the index from user requirements and establishes the corresponding index, then maintains the index according to the data life cycle. When creating an index, there is no longer any need to split hot and cold data into separate tables or to migrate data, so the database design is more reasonable.
  • The data index creation device further provided by the embodiment of the present invention has a small amount of index data, low maintenance cost, and high query execution efficiency. In addition, it is more conducive to the rational design and management of the database.
  • FIG. 1 is a flowchart of a method for creating a data index according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a method for creating a data index according to a preferred embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a data index creation apparatus according to an embodiment of the present invention.
  • a data index creation method includes the following steps: acquiring a data filtering condition;
  • the target table data is selected to obtain table data satisfying the data filtering condition, and an index is created for the target table data satisfying the data filtering condition.
  • FIG. 1 is a flowchart of a method for creating a data index according to an embodiment of the present invention; the foregoing method may be implemented by using the process shown in FIG. 1, and includes the following steps:
  • Step 101 Obtain data filtering conditions.
  • the obtaining data filtering condition is: acquiring a data filtering condition by using an SQL command for creating an index.
  • the data filtering condition is obtained by analyzing the user's software requirements, generating an SQL command for creating an index, and parsing that command.
  • Alternatively, the database may receive a user's instruction to create an index and obtain the data filtering condition from that instruction.
  • Step 102 Filter the target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and create an index on the table data that satisfies the data filtering condition.
  • the target table data may be judged one by one whether the data filtering condition is met, and when it is determined that the target table data satisfies the data filtering condition, the target table data is calculated to obtain index data;
  • the index data calculated from the target table data that satisfies the data filtering condition constitutes a complete index.
  • the data filtering condition may be a range condition for selecting the data content of the target table, for example, a range condition for filtering the name, the attribute, the size, the recording time, and the like.
  • all the target table data satisfying the data filtering condition may be selected first, and then the target table data satisfying the data filtering condition may be calculated one by one, and finally a complete index is created.
  • The method for creating a data index provided by the embodiment of the present invention filters the target table data according to a given data filtering condition, filters out the target table data that does not satisfy the condition, and creates an index only for part of the data in the database, so that the amount of data in the index is greatly reduced.
  • the data can be filtered according to the needs of use, which makes the use efficiency of the index significantly improved.
  • the method provided by the present invention has more advantages in terms of creation efficiency and the like.
  • the speed at which users access the database through the index is improved.
  • the maintenance cost of the index is also reduced. After the index is reduced, the operation of adding and deleting data becomes simple; the speed of creating the index is improved, the difficulty is reduced, and the time is reduced, making it easier for the user to create a new index according to his needs.
  • the data filtering condition is obtained by analyzing a creation instruction sent by the user in a SQL manner, and the foregoing step 101 may include:
  • Parsing the SQL instruction that creates the index and obtaining a data filtering condition from the SQL instruction that creates the index.
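  • As a hedged illustration of step 101, the sketch below pulls a filtering condition out of an index creation statement. The `CREATE INDEX ... WHERE ...` shape, the regular expression, and the names `parse_create_index`, `idx_recent`, and `orders` are assumptions made for this example; the source does not fix a concrete grammar.

```python
import re

# Hypothetical grammar: CREATE INDEX <name> ON <table> (<columns>) [WHERE <condition>]
CREATE_INDEX = re.compile(
    r"create\s+index\s+(?P<name>\w+)\s+on\s+(?P<table>\w+)\s*\((?P<columns>[^)]*)\)"
    r"(?:\s+where\s+(?P<condition>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_create_index(sql):
    """Return the index name, table, columns, and optional filtering condition."""
    match = CREATE_INDEX.match(sql.strip())
    if match is None:
        raise ValueError("not a recognised CREATE INDEX statement")
    return {
        "name": match.group("name"),
        "table": match.group("table"),
        "columns": [col.strip() for col in match.group("columns").split(",")],
        "condition": match.group("condition"),  # None when no WHERE clause is given
    }


parsed = parse_create_index(
    "CREATE INDEX idx_recent ON orders (created, status) WHERE created >= '2014-01-01';"
)
unfiltered = parse_create_index("CREATE INDEX idx_all ON orders (id)")
```

    A production parser would use the database's own SQL grammar; the regular expression only shows where the condition sits in the statement.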
  • Step 201 Read target table data in the database.
  • the target table data refers to table data in a data table related to an index in a database.
  • Step 202 Determine whether the target table data satisfies the data filtering condition. When the result of the determination is yes, step 203 is performed; when the result of the determination is no, step 205 is performed.
  • Step 203 Perform calculation on the table data that satisfies the filtering condition to obtain index data.
  • Step 204 Write the index data to the position of the corresponding pointer on the index.
  • Step 205 Determine whether the index is created. When the result of the determination is yes, the operation ends; when the result of the determination is no, the process returns to step 201.
  • the target data table refers to a data table corresponding to the target table data. If it is determined in step 205 that the index has been created, the operation ends.
  • a flow including at least steps 201 through 204 should be performed.
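  • Steps 201 to 205 can be sketched as a single loop. This is a minimal in-memory illustration, not the patented implementation: the index is assumed to be a sorted list of `(key, row_position)` pairs, and `build_filtered_index`, `key_of`, and `satisfies` are hypothetical names.

```python
from bisect import insort


def build_filtered_index(rows, key_of, satisfies):
    """Build a sorted list of (key, row_position) entries for rows passing the filter."""
    index = []
    for position, row in enumerate(rows):   # step 201: read the next target table row
        if not satisfies(row):               # step 202: test the data filtering condition
            continue                         # condition not met: go on to the next row
        entry = (key_of(row), position)      # step 203: compute the index data
        insort(index, entry)                 # step 204: write it at its sorted position
    return index                             # step 205: every row was read, index is done


rows = [
    {"id": 1, "status": "open", "amount": 30},
    {"id": 2, "status": "closed", "amount": 10},
    {"id": 3, "status": "open", "amount": 20},
]
index = build_filtered_index(
    rows,
    key_of=lambda row: row["amount"],
    satisfies=lambda row: row["status"] == "open",
)
```

    Only the two rows with status "open" produce index entries, so the index holds two entries for a three-row table.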
  • the data filtering condition includes a data filtering condition included in a user create index command.
  • the data filtering condition includes: obtaining a data filtering condition from a SQL statement corresponding to the user creating an index command.
  • the data filtering condition includes a data filtering condition that reflects a user operation requirement. It may be obtained by analyzing the operations the user is likely to perform on the target data table, extracting the filtering conditions from the SQL statements corresponding to those operations, and removing conditions that are repeated across different SQL statements, which yields the one or more distinct data filtering conditions needed to create an index.
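  • A rough sketch of collecting filtering conditions from several SQL statements and dropping repeats follows. The helper name `extract_conditions`, the sample statements, and the simple "everything after the first WHERE" rule are assumptions for illustration; real statements would need a proper SQL parser.

```python
import re


def extract_conditions(statements):
    """Collect the distinct WHERE conditions of the statements, in first-seen order."""
    distinct = []
    for sql in statements:
        match = re.search(r"\bwhere\b(.*)", sql, flags=re.IGNORECASE | re.DOTALL)
        if match is None:
            continue  # statement carries no filtering condition
        # Normalise whitespace and drop a trailing semicolon before comparing.
        condition = " ".join(match.group(1).strip().rstrip(";").split())
        if condition not in distinct:
            distinct.append(condition)
    return distinct


statements = [
    "SELECT * FROM orders WHERE status = 'open'",
    "UPDATE orders SET amount = 0 WHERE  status = 'open';",
    "DELETE FROM orders WHERE created < '2014-01-01'",
    "SELECT COUNT(*) FROM orders",
]
conditions = extract_conditions(statements)
```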
  • the data filtering condition includes: the selection condition on which the target group used to create the index is selected from the groups into which the table data is divided according to the data life cycle.
  • the data in the big data target data table is grouped according to the data life cycle; then, in this embodiment, at least one group is selected from the data group as the target group according to the data filtering condition, and the data filtering is satisfied.
  • the data filtering condition includes a filtering condition upon which a target group is selected from the at least one group.
  • when grouping the data in the at least one target data table, the data table structure can be analyzed to find the fields that can serve as a time axis; these fields form the time axis, the life cycle stage of the table data is obtained according to the time axis, and the table data in the database is divided into groups. More specifically, the data can be divided by day.
  • At least one of the groups is selected as the target group, and an index is created for the target group; the selection condition of the target group is included in the data filtering condition used to create the index.
  • Such data filtering conditions are added to the SQL statement that creates the index, and finally the index of the target table data is created according to the SQL statement that creates the index.
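  • The day-based grouping and target group selection described above might look like the following sketch. The field name `recorded_at`, the helper `group_by_day`, and the sample rows are hypothetical; the source only says that a time-axis field is used and that data can be divided by day.

```python
from collections import defaultdict
from datetime import date, datetime


def group_by_day(rows, time_field):
    """Group rows by the calendar day of their time-axis field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[time_field].date()].append(row)
    return dict(groups)


rows = [
    {"id": 1, "recorded_at": datetime(2014, 7, 1, 9, 0)},
    {"id": 2, "recorded_at": datetime(2014, 7, 1, 18, 30)},
    {"id": 3, "recorded_at": datetime(2014, 7, 2, 8, 15)},
]
groups = group_by_day(rows, "recorded_at")

# The selection condition of the target group acts as the data filtering condition.
target_days = {date(2014, 7, 2)}
target_rows = [row for day in sorted(target_days) for row in groups.get(day, [])]
```

    Only `target_rows` would then be passed to index creation; the other day groups stay unindexed.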
  • the data grouping method used in the prior art often separates data in the same time period.
  • the hotspot data and the non-hotspot data are separated, and an index is separately created.
  • Data in the same time phase has basically the same index structure; for example, hotspot data and non-hotspot data have substantially the same index structure. The method provided by the above embodiment of the present invention selects the target group from the groups into which the data in the database is divided according to the data life cycle, obtains the table data satisfying the filtering condition, and establishes the corresponding index. This favors creating a unified index for different types of data with the same index structure, and is therefore more efficient when processing big data.
  • the above analysis of the user requirements may be an analysis of the query conditions in the corresponding SQL query statement when the user accesses the database.
  • Database developers determine the filtering condition of the index according to actual needs and establish the corresponding index, which removes the need to split hot and cold data into separate tables or to migrate data, so the database design is more reasonable.
  • the data index creation method provided by the embodiment of the present invention further includes managing the index at different stages of the table data life cycle: as the stage of the target table data in the data life cycle changes, the index is changed accordingly.
  • the index established on the status field is often used to query the operation status and update; if the historical data is not updated, the index on the status field is no longer needed, and the corresponding index data needs to be deleted.
  • the index before the one-year range needs to be deleted.
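  • As one possible reading of this life cycle maintenance, the sketch below drops index entries that fall outside a one-year window. The index layout `(day, row_position)`, the helper `prune_index`, and the 365-day default are assumptions for illustration only.

```python
from datetime import date, timedelta


def prune_index(index, today, keep_days=365):
    """Keep only index entries whose day lies within the last keep_days days."""
    cutoff = today - timedelta(days=keep_days)
    return [(day, position) for day, position in index if day >= cutoff]


index = [
    (date(2013, 1, 5), 0),   # older than one year: no longer needs an index entry
    (date(2014, 3, 1), 1),
    (date(2014, 6, 30), 2),
]
pruned = prune_index(index, today=date(2014, 7, 1))
```

    The table rows themselves are untouched; only the index entries for historical data are removed.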
  • the data filtering condition is implemented by a where statement.
  • the data filtering condition is implemented by the where statement; that is, the target table data is filtered by executing the corresponding where statement.
  • the where statement includes a where instruction and a conditional expression.
  • the conditional expression can be a combination of columns, constants, and operators.
  • the where statement can be set at the end of the current SQL creation index statement.
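  • For comparison, SQLite's partial indexes accept a `WHERE` clause at the end of `CREATE INDEX`, which resembles the form described here. The sketch below uses Python's standard `sqlite3` module; SQLite is a stand-in chosen for illustration and is not named in the source, and the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO orders (status, created) VALUES (?, ?)",
    [("open", "2014-07-01"), ("closed", "2013-02-11"), ("open", "2014-06-15")],
)

# The conditional expression after WHERE restricts which rows enter the index.
conn.execute("CREATE INDEX idx_orders_open ON orders (created) WHERE status = 'open'")

# Read back the stored definition to confirm the condition is part of the index.
index_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'index' AND name = 'idx_orders_open'"
).fetchone()[0]
open_count = conn.execute("SELECT COUNT(*) FROM orders WHERE status = 'open'").fetchone()[0]
conn.close()
```

    Here only the rows counted by `open_count` are eligible for index entries, while the closed order is left out of the index.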
  • the data filtering condition is implemented by a statement containing a range word.
  • the range word may be a range instruction, a before instruction, or the like.
  • the data filtering condition may also be implemented by a statement containing a list word.
  • the list word can be a list instruction.
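  • The source does not give the syntax of range or list statements, so the following is only a sketch of how such conditions could be turned into row predicates. The function `make_predicate`, the `"range"` and `"list"` tags, and the half-open interval convention are all assumptions.

```python
def make_predicate(kind, field, *args):
    """Build a row predicate from a hypothetical range or list filtering condition."""
    if kind == "range":
        low, high = args
        return lambda row: low <= row[field] < high   # half-open interval [low, high)
    if kind == "list":
        allowed = set(args[0])
        return lambda row: row[field] in allowed      # membership in an explicit list
    raise ValueError(f"unknown condition kind: {kind}")


rows = [
    {"id": 1, "year": 2012, "region": "north"},
    {"id": 2, "year": 2014, "region": "south"},
    {"id": 3, "year": 2014, "region": "east"},
]
in_range = make_predicate("range", "year", 2013, 2015)
in_list = make_predicate("list", "region", ["north", "south"])

range_ids = [row["id"] for row in rows if in_range(row)]
list_ids = [row["id"] for row in rows if in_list(row)]
```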
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the data index creation method according to the embodiment of the present invention.
  • Step 301 Receive the user's instruction to create an index.
  • the instruction to create an index may be an SQL instruction that creates an index.
  • Step 302 Obtain a data filtering condition from the instruction for creating an index.
  • Step 303 Read the target table data.
  • the target table data refers to table data in a database corresponding to an instruction to create an index.
  • Step 304 Determine whether the target table data satisfies the data filtering condition; when the result of the determination is yes, execute step 305; when the result of the determination is no, perform step 307.
  • Step 305 Calculate the target table data to obtain index data.
  • Step 306 Write the index data to a corresponding pointer position on the index.
  • Step 307 Determine whether there is unread target table data in the database; when the result of the determination is yes, the process returns to step 303; when the result of the determination is no, the operation flow ends.
  • FIG. 4 is a schematic structural diagram of a data index creation device according to an embodiment of the present invention; as shown in FIG. 4, the device includes:
  • the data filtering condition obtaining module 41 is configured to obtain a data filtering condition
  • the index creation module 42 is configured to filter the target table data according to the data filtering condition acquired by the data filtering condition obtaining module 41 to obtain table data that satisfies the data filtering condition, and to create an index on that table data.
  • When the index is created, the data index creation device provided by the embodiment of the present invention filters the target table data according to a given data filtering condition, filters out the target table data that does not satisfy the condition, and creates an index only for the table data that satisfies it, so that the amount of data in the index is greatly reduced.
  • the data filtering condition can be obtained by analyzing a creation instruction sent by the user in SQL, or by analyzing user operation requirements.
  • the data filtering condition obtaining module 41 may further include:
  • the create instruction receiving unit 411 is configured to receive the user's instruction to create an index
  • the create instruction analysis unit 412 is configured to acquire a data filtering condition from the index creation instruction received by the create instruction receiving unit 411.
  • the index creation module 42 further includes: a table data reading unit 421 configured to read target table data in the database;
  • the data filtering condition determining unit 422 is configured to determine whether the target table data read by the table data reading unit 421 satisfies the data filtering condition, and to trigger the calculating unit 423 when the target table data satisfies the data filtering condition;
  • the calculating unit 423 is configured to calculate target data that meets the data filtering condition to obtain index data.
  • the index data writing unit 424 is configured to write the index data obtained by the calculating unit 423 to the position of the corresponding pointer on the index.
  • the index creation module 42 further includes a data discriminating unit 425 configured to, when the target table data is processed in batches, determine whether the entire index has been created after the index data writing unit 424 writes the index data to the position of the corresponding pointer on the index, and, if it has not, to trigger the table data reading unit 421 to continue reading the target table data in the database.
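  • The cooperation of units 421 to 425 under batch processing can be sketched as below. The mapping of code lines to unit numbers is an interpretation, and `create_index_in_batches`, the dictionary-based index, and the batch size are assumptions rather than details from the source.

```python
def create_index_in_batches(table, satisfies, key_of, batch_size=2):
    """Build a key -> row positions index, reading the table one batch at a time."""
    index = {}
    offset = 0
    while offset < len(table):                        # unit 425: unread data remains?
        batch = table[offset:offset + batch_size]     # unit 421: read the next batch
        for position, row in enumerate(batch, start=offset):
            if satisfies(row):                        # unit 422: check the condition
                key = key_of(row)                     # unit 423: compute the index data
                index.setdefault(key, []).append(position)  # unit 424: write to the index
        offset += len(batch)
    return index


table = [
    {"k": "a", "hot": True},
    {"k": "b", "hot": False},
    {"k": "a", "hot": True},
    {"k": "c", "hot": True},
    {"k": "d", "hot": False},
]
index = create_index_in_batches(
    table, satisfies=lambda row: row["hot"], key_of=lambda row: row["k"]
)
```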
  • the data filtering condition may include a data filtering condition included in a user create index command.
  • the data filtering conditions can be obtained by analyzing user operation requirements and then written into the instruction for creating an index.
  • the data filtering condition obtaining module 41 is configured to obtain a data filtering condition by using an SQL command that creates an index.
  • the data filtering condition includes a data filtering condition that reflects a user operation requirement. It may be obtained by analyzing the query, update, and delete operations the user is likely to perform on the target table data, extracting the filtering conditions from the SQL statements corresponding to those operations, and removing conditions that are repeated across different SQL statements. This gives the main filtering conditions under which the user operates on the table; after performance and other aspects are weighed, the one or more distinct data filtering conditions needed to create the index are finally obtained.
  • the target table data is all data in the data table.
  • the data is big data
  • the data filtering conditions include: selecting, from the grouping of the table data according to the data life cycle, a selection condition according to the target group used to create the index.
  • the data in the big data target data table is grouped according to the data life cycle; then, in this embodiment, at least one group is selected from the data group as the target group according to the data filtering condition, and the data filtering is satisfied.
  • the data filtering condition includes a filtering condition upon which a target group is selected from the at least one group.
  • According to the data grouping conditions obtained from the requirements, the data table structure is analyzed to find the fields that can serve as a time axis; these fields form the time axis, the life cycle stage of the table data is obtained according to the time axis, and the table data in the database is divided into multiple groups. More specifically, the data can be divided by day.
  • one or more of the groupings may be selected as the target group according to the data filtering conditions obtained by analyzing the user operation requirements, and the selection condition of the target group is the data filtering condition.
  • Such data filtering conditions are added to the SQL statement that creates the index, and finally the index of the target table data is created according to the SQL statement that creates the index.
  • the data filtering condition is implemented by a where statement, or by a statement containing a range word, or by a statement containing a list word.
  • the big data indexing device includes a data filtering condition acquisition module and an index creation module.
  • the data filtering condition obtaining module 41 is configured to acquire a data filtering condition; the data filtering condition obtaining module 41 includes:
  • the create instruction receiving unit 411 is configured to receive the user's instruction to create an index
  • the create instruction analysis unit 412 is configured to acquire a data filtering condition from the index creation instruction received by the create instruction receiving unit 411.
  • the index creation module 42 is configured to: select, according to the data filtering condition, the target table data to obtain the table data that meets the filtering condition, and create an index on the table data that meets the data filtering condition; the index creating module 42 includes:
  • the table data reading unit 421 is configured to read the target table data; as an embodiment, the target table data refers to table data in a database corresponding to the instruction for creating an index;
  • the data filtering condition determining unit 422 is configured to determine whether the target table data read by the table data reading unit 421 satisfies the data filtering condition, and to trigger the calculating unit 423 when the target table data satisfies the data filtering condition;
  • the calculating unit 423 is configured to calculate the target table data that satisfies the data filtering condition, to obtain index data;
  • the index data writing unit 424 is configured to write the index data obtained by the calculating unit 423 into a corresponding pointer position on the index;
  • the data discriminating unit 425 is configured to determine, after the index data writing unit 424 writes the index data to the position of the corresponding pointer on the index, whether the index has been fully created, and, when unread target table data remains in the database, to trigger the table data reading unit 421 to continue reading the target table data in the database.
  • the data index creation device may be implemented by a database in an actual application; the data filtering condition acquiring module 41 and its submodules (the create instruction receiving unit 411 and the create instruction analysis unit 412), and the index creation module 42 and its submodules (the table data reading unit 421, the data filtering condition determining unit 422, the calculating unit 423, the index data writing unit 424, and the data discriminating unit 425), may be used by the device in practical applications.
  • the data index creation device provided by the embodiment of the present invention has a small amount of index data and a high usage rate, which improves the speed at which the user accesses the database through the index and also reduces the maintenance cost of the index, making index updates easier.
  • the data index creation device provided by the embodiment of the present invention manages data according to the big data life cycle, which helps improve the rationality of the database design.
  • embodiments of the present invention can be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) having computer-usable program code embodied therein.
  • the present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flow and/or blocks in the flowcharts and/or block diagrams can be implemented by computer program instructions.
  • the computer program instructions can be provided to a processor of a general purpose computer, a special purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • When the index is created, the target table data is filtered according to a data filtering condition: target table data that does not satisfy the condition is filtered out, and an index is created for only part of the data in the database, so the amount of data in the index is greatly reduced. Target table data that is essentially never used is kept out of the index, which significantly improves the efficiency of index use and the speed at which users access the database through the index.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data index creation method and device, and a computer storage medium. The method comprises: acquiring a data filtering condition; and according to the data filtering condition, screening target table data to obtain table data satisfying the data filtering condition, and creating an index for the target table data satisfying the data filtering condition.

Description

Data index creation method and device, and computer storage medium

Technical field
The present invention relates to the field of database technologies, and in particular to a data index creation method, a data index creation device, and a computer storage medium.

Background
A database is a data processing device developed to meet the needs of data processing. Database systems first emerged in 1960; the relational model was proposed in 1970, and relational databases were built on that basis.
A relational database is a database built on the relational model and has a solid mathematical foundation. As information technology has developed, databases have penetrated every industry and application, and relational databases in particular are widely used. However, the world has changed dramatically: compared with the era when the database concept was first proposed, the characteristics of data are very different. Data that is both structurally complex and large in volume is collectively called big data. Faced with this type of data, traditional relational database systems show more and more drawbacks.
Relational databases use Structured Query Language (SQL) as the database operation language, and SQL has been revised several times since it was first proposed. In 1974, SQL was first implemented by IBM. In 1986, the American National Standards Institute (ANSI) adopted it as the American standard database language. In 1987, the International Organization for Standardization (ISO) adopted ANSI SQL as an international standard. In 1989, ISO issued the SQL89 standard with integrity features. In 1992, ISO published the SQL92 standard.
An index is a structure that sorts the values in a database table; using an index, specific information in the table can be accessed quickly. In the prior art, a database index is created as follows: the application's usage scenarios for the database and the query conditions of its SQL statements are analyzed to obtain the indexes that may be needed and their candidate fields; then, weighing performance and other factors, one or more columns of the data table are chosen as index fields and one or more indexes are created. Because each index covers all the data in the table, the index-field value of every row in the table corresponds to one item of index data. Whenever the index-field value of any row is inserted, deleted, or modified, the index data must be inserted, deleted, or modified as well.
Given the changing characteristics of modern data, indexes created with the prior art hold a large amount of data and occupy more and more space, so storage costs keep rising and the efficiency with which users access table information is noticeably affected. Moreover, an index is used when an SQL statement executes so that users can access data quickly, but some data in the index goes unused by SQL statements for long periods, which lowers the utilization of the index data.
In addition, as the data grows, index maintenance costs rise accordingly and application access becomes slower and slower; inserting and deleting data becomes difficult, and the system scales poorly. In the big data era, data keeps expanding, yet current commercial relational databases are suited to tables with no more than tens of millions of records. In some large systems, when the table data volume reaches a higher order of magnitude, for example terabytes (TB), petabytes (PB), or zettabytes (ZB), the amount of data in the index is so large that inserting new data is very difficult and creating a new index is also hard.

Summary
To solve the existing technical problems, an embodiment of the present invention provides a data index creation method, including the following steps:
acquiring a data filtering condition;
screening target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and creating an index for the target table data that satisfies the data filtering condition.
Optionally, screening the target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and creating an index for the table data that satisfies the data filtering condition, includes:
reading target table data in the database;
when it is determined that the target table data satisfies the data filtering condition, performing a calculation on that target table data to obtain index data;
writing the index data to the position of the corresponding pointer in the index.
Preferably, after the index data is written to the position of the corresponding pointer in the index, the method further includes:
determining whether creation of the index is complete, and obtaining a determination result; when the result is that the index is complete, ending the operation; when the result is that the index is not complete, reading target table data in the database again, and, when it is determined that the target table data satisfies the data filtering condition, performing a calculation on that target table data to obtain index data and writing the index data to the position of the corresponding pointer in the index.
Preferably, acquiring the data filtering condition means acquiring the data filtering condition from the SQL command that creates the index.
Preferably, when the data is big data, the data filtering condition includes the filtering condition by which a target group used to create the index is selected from groups into which the table data has been divided according to the data life cycle.
Preferably, the data filtering condition is implemented by a where statement, by a statement containing a range keyword, or by a statement containing a list keyword.
An embodiment of the present invention further provides a data index creation device, including:
a data filtering condition acquisition module, configured to acquire a data filtering condition; and
an index creation module, configured to screen target table data according to the data filtering condition acquired by the data filtering condition acquisition module to obtain table data that satisfies the data filtering condition, and to create an index for the table data that satisfies the data filtering condition.
Optionally, the index creation module includes:
a table data reading unit, configured to read target table data in the database;
a data filtering condition determination unit, configured to determine whether the target table data read by the table data reading unit satisfies the data filtering condition, and to trigger the calculation unit when the target table data satisfies the data filtering condition;
a calculation unit, configured to perform a calculation on the target table data that satisfies the data filtering condition to obtain index data; and
an index data writing unit, configured to write the index data obtained by the calculation unit to the position of the corresponding pointer in the index.
Optionally, the index creation module further includes a data discrimination unit, configured to determine, after the index data writing unit writes the index data to the position of the corresponding pointer in the index, whether unread table data remains in the database, and, if the index is not yet complete, to trigger the table data reading unit to continue reading target table data in the database.
Preferably, the data filtering condition acquisition module is configured to acquire the data filtering condition from the SQL command that creates the index.
Preferably, the data is big data, and the data filtering condition includes the selection condition by which a target group used to create the index is selected from groups into which the table data has been divided according to the data life cycle.
Preferably, the data filtering condition is implemented by a where statement, by a statement containing a range keyword, or by a statement containing a list keyword.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the data index creation method according to the embodiments of the present invention.
As can be seen from the above, with the data index creation method provided by the embodiments of the present invention, the target table data is filtered according to a data filtering condition when the index is created: target table data that does not satisfy the condition is filtered out, and an index is created for only part of the data in the database, so the amount of data in the index is greatly reduced. Because filtering keeps target table data that is essentially never used out of the index, the index is used far more efficiently.
In addition, because redundant table data is filtered out, users access the database through the index faster. Index maintenance costs fall as well: with a smaller index, inserting and deleting data becomes simpler, and creating an index becomes faster, easier, and less time-consuming, so users can more readily create new indexes as their needs require. In particular, when indexing big data or massive data, the method provided by the embodiments of the present invention can form an index more quickly.
In addition, the data index creation method of the embodiments allows database developers to manage the database according to the data life cycle as actual needs require: a developer can first determine the screening condition of the index from user requirements and create the corresponding index, and then maintain the index according to the data life cycle. When an index is created, hot and cold data no longer need to be separated out of the table and no data migration is required, which makes the database design more reasonable.
The data index creation device further provided by the embodiments of the present invention creates indexes with a small data volume, low maintenance cost, and high query execution efficiency. It also facilitates reasonable design and management of the database.

Brief description of the drawings
FIG. 1 is a flowchart of a data index creation method according to an embodiment of the present invention;
FIG. 2 is a flowchart of one implementation, according to an embodiment of the present invention, of the step of screening target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition and creating an index for that target table data;
FIG. 3 is a flowchart of a data index creation method according to a preferred embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data index creation device according to an embodiment of the present invention.

Detailed description
To give an effective implementation, the present invention provides the following embodiments, which are described below with reference to the accompanying drawings.
The data index creation method provided by an embodiment of the present invention includes the following steps:
acquiring a data filtering condition;
screening target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and creating an index for the target table data that satisfies the data filtering condition.
FIG. 1 is a flowchart of the data index creation method according to an embodiment of the present invention. The method can be implemented by the flow shown in FIG. 1, which includes the following steps:
Step 101: Acquire a data filtering condition.
Here, acquiring the data filtering condition means acquiring it from the SQL command that creates the index.
In one embodiment, the data filtering condition is obtained by analyzing the user's software requirements, generating an SQL command that creates the index, and then parsing that command. Alternatively, the database may receive the user's index creation instruction and acquire the data filtering condition from that instruction.
Step 102: Screen the target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and create an index for the table data that satisfies the data filtering condition.
In a specific embodiment, each item of target table data may be checked in turn against the data filtering condition; when an item satisfies the condition, a calculation is performed on it to obtain index data. The index data calculated from all target table data that satisfies the data filtering condition together forms the complete index.
The data filtering condition may be a range condition that screens the content of the target table data, for example a range condition on name, attribute, size, record time, and so on.
In another specific embodiment, all target table data that satisfies the data filtering condition may first be selected, and the calculation then performed on each selected item in turn to finally create the complete index.
With the data index creation method provided by the embodiments of the present invention, the target table data is filtered according to a data filtering condition when the index is created: target table data that does not satisfy the condition is filtered out, and an index is created for only part of the data in the database, so the amount of data in the index is greatly reduced. The data can also be filtered according to usage needs, which significantly improves the efficiency of index use. The method's advantages in creation efficiency and other respects are especially pronounced when the data to be indexed is big data.
In addition, because redundant table data is filtered out, users access the database through the index faster. Index maintenance costs fall as well: with a smaller index, inserting and deleting data becomes simpler, and creating an index becomes faster, easier, and less time-consuming, so users can more readily create new indexes as their needs require.
If, in a specific embodiment, the data filtering condition is obtained by analyzing an index creation instruction sent by the user in SQL form, step 101 may include:
receiving the user's SQL instruction for creating an index;
parsing the SQL instruction for creating the index, and acquiring the data filtering condition from it.
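The parsing part of step 101 might look roughly like the following sketch. The source does not fix a concrete grammar for the index creation command, so the form `CREATE INDEX <name> ON <table> (<columns>) [WHERE <condition>]` used here is an assumption, as are the names `IndexRequest` and `parse_create_index`; a real database would use its full SQL parser rather than a regular expression.

```python
import re
from typing import List, NamedTuple, Optional


class IndexRequest(NamedTuple):
    """Parsed pieces of an index creation command (hypothetical structure)."""
    index_name: str
    table: str
    columns: List[str]
    condition: Optional[str]  # None means no data filtering condition was given


# Assumed, deliberately narrow grammar:
#   CREATE INDEX <name> ON <table> (<col>[, <col>...]) [WHERE <condition>]
_CREATE_INDEX = re.compile(
    r"^\s*CREATE\s+INDEX\s+(?P<name>\w+)\s+ON\s+(?P<table>\w+)\s*"
    r"\((?P<cols>[^)]*)\)\s*(?:WHERE\s+(?P<cond>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_create_index(sql: str) -> IndexRequest:
    """Extract the target table, index columns and filtering condition."""
    match = _CREATE_INDEX.match(sql)
    if match is None:
        raise ValueError("not a recognised index creation command")
    columns = [c.strip() for c in match.group("cols").split(",") if c.strip()]
    return IndexRequest(
        match.group("name"), match.group("table"), columns, match.group("cond")
    )
```

The returned `condition` string is what the later filtering step would evaluate against each item of target table data.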
FIG. 2 is a flowchart of one implementation, according to an embodiment of the present invention, of the step of screening target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition and creating an index for that target table data. With reference to FIG. 1, in a specific embodiment, step 102 may be implemented by the flow shown in FIG. 2, which includes the following steps:
Step 201: Read target table data in the database.
In one embodiment, the target table data refers to the table data in the data table in the database to which the index relates.
Step 202: Determine whether the target table data satisfies the data filtering condition. If yes, perform step 203; if no, perform step 205.
In one embodiment, if it is determined that the target table data does not satisfy the data filtering condition, no index needs to be created for that target table data.
Step 203: Perform a calculation on the table data that satisfies the filtering condition to obtain index data.
Step 204: Write the index data to the position of the corresponding pointer in the index.
Step 205: Determine whether creation of the index is complete. If yes, end the operation; if no, return to step 201.
Here, when index data is calculated in turn for the target table data in the target data table that satisfies the data filtering condition, the index is known not to be complete if, after a calculation, it is determined that unread data remains in the target data table. In this embodiment, the target data table is the data table to which the target table data belongs. If it is determined in step 205 that the index is complete, the operation ends.
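As a rough illustration of steps 201 to 205, the loop below builds an in-memory index over only the rows that pass the condition. The source does not say how index data is calculated or what the "position of the corresponding pointer" is, so this sketch assumes the index is a sorted list of `(key, row_id)` pairs and that the condition is a Python callable; `build_filtered_index` is a hypothetical name.

```python
from bisect import insort
from typing import Any, Callable, Iterable, List, Mapping, Tuple

Row = Mapping[str, Any]


def build_filtered_index(
    rows: Iterable[Tuple[int, Row]],
    key_column: str,
    condition: Callable[[Row], bool],
) -> List[Tuple[Any, int]]:
    """Index only the rows that satisfy the data filtering condition."""
    index: List[Tuple[Any, int]] = []
    for row_id, row in rows:                # step 201: read the next item of table data
        if not condition(row):              # step 202: check the filtering condition
            continue                        # not satisfied: go straight to step 205
        entry = (row[key_column], row_id)   # step 203: compute the index data
        insort(index, entry)                # step 204: write it at its sorted position
    return index                            # step 205: no unread rows remain, so finish
```

Rows that fail the condition never reach the index, which is where the reduction in index size described in the text comes from.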
With reference to FIG. 1 and FIG. 2, in some embodiments, implementing step 102 requires performing a flow that includes at least steps 201 to 204.
In some embodiments, the data filtering condition includes the data filtering condition contained in the user's index creation command. Specifically, the data filtering condition is obtained from the SQL statement corresponding to the user's index creation command.
Specifically, the data filtering condition includes a data filtering condition that reflects the user's operational needs. It can be obtained by analyzing the operations the user is likely to perform on the target data table being indexed, extracting the data filtering conditions from the SQL statements corresponding to those operations, and removing conditions that are duplicated across different SQL statements, which yields the one or more distinct data filtering conditions needed to create the index.
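The deduplication described above could be sketched as follows. This is an assumption-laden illustration: it treats everything after the first `WHERE` keyword as the condition (so trailing clauses such as `ORDER BY` are not handled), compares conditions only after collapsing whitespace, and uses the hypothetical name `distinct_conditions`.

```python
import re
from typing import Iterable, List

# Crude assumption: the condition is everything after the first WHERE keyword.
_WHERE = re.compile(r"\bWHERE\b\s+(.+?)\s*;?\s*$", re.IGNORECASE | re.DOTALL)


def normalise(condition: str) -> str:
    """Collapse runs of whitespace so trivially different spellings compare equal."""
    return " ".join(condition.split())


def distinct_conditions(statements: Iterable[str]) -> List[str]:
    """Collect each filtering condition once, in order of first appearance."""
    seen = set()
    result: List[str] = []
    for sql in statements:
        match = _WHERE.search(sql)
        if match is None:
            continue  # this statement carries no filtering condition
        condition = normalise(match.group(1))
        if condition not in seen:
            seen.add(condition)
            result.append(condition)
    return result
```

Each surviving condition is a candidate data filtering condition for an index on the target data table.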
In real life, data has a life cycle: users typically operate on newer data more frequently, and the older the data, the less it is operated on. With existing database technology, to meet execution efficiency requirements, database developers split the same table into an online table and a history table; the online table holds a small amount of recent data, and the rest is kept in the history table. Migrating data from the online table to the history table is a heavy workload, and the whole process consumes a large amount of system input/output (I/O) and CPU.
In view of this, in some embodiments of the present invention, when the target table data is big data, the data filtering condition includes the selection condition by which a target group used to create the index is selected from groups into which the table data has been divided according to the data life cycle.
During development, the developer groups the data in the big-data target data table according to the data life cycle; this embodiment then selects at least one group from those groups as the target group according to the data filtering condition, obtaining the table data that satisfies the data filtering condition. The data filtering condition includes the filtering condition by which the target group is selected from the groups.
Specifically, to group the data in the target data table, the table structure can be analyzed to find fields that can serve as a time axis; these fields are combined into a time axis, the stage of the life cycle that each item of table data is in is determined from the time axis, and the table data in the database is divided accordingly. More specifically, the data can be divided in units of days.
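A minimal sketch of day-based grouping along a time axis is shown below. It assumes a single `datetime` column serves as the time axis; the stage names `"hot"` and `"history"` and the seven-day window are illustrative choices, not values taken from the source.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Any, Dict, Iterable, List, Mapping


def group_by_day(
    rows: Iterable[Mapping[str, Any]], time_column: str
) -> Dict[date, List[Mapping[str, Any]]]:
    """Divide table data into groups, one per calendar day of the time-axis field."""
    groups: Dict[date, List[Mapping[str, Any]]] = defaultdict(list)
    for row in rows:
        groups[row[time_column].date()].append(row)
    return dict(groups)


def life_cycle_stage(day: date, today: date, hot_days: int = 7) -> str:
    """Classify a day-group by age; the threshold and labels are assumptions."""
    age_in_days = (today - day).days
    return "hot" if age_in_days < hot_days else "history"
```

A target group can then be picked by stage, for example every day-group whose stage is `"hot"`, and the matching date range becomes part of the data filtering condition.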
Once the big data table is grouped, at least one group can be selected as the target group according to the data filtering condition obtained by analyzing the user's operational needs, and an index is created for the target group; the selection condition for the target group is included in the data filtering condition for creating the index. This data filtering condition is added to the SQL statement that creates the index, and the index is finally created on the target table data according to that SQL statement.
The data grouping approach used in the prior art often separates data from the same time stage, for example separating hot data from non-hot data and creating a separate index for each. Yet data from the same time stage has essentially the same index structure; the index structures of hot and non-hot data, for instance, are essentially the same. With the method provided by the above embodiment, the target group can be selected from groups into which the data in the database has been divided according to the data life cycle, yielding the table data that satisfies the filtering condition, and the corresponding index is then created. This makes it easier to create a unified index over different types of data that share an index structure, and is therefore more efficient when processing big data.
Analyzing user needs, as described above, may mean analyzing the query conditions in the SQL query statements issued when the user accesses the database.
With the method provided by the embodiments of the present invention, database developers determine the screening condition of the index from user requirements as actual needs dictate and create the corresponding index; hot and cold data no longer need to be separated out of the table, no data migration is required, and the database design becomes more reasonable.
Users have different operational needs for data in different time ranges, so the index has to be modified accordingly. In view of this, in other embodiments of the present invention, the data index creation method further includes managing the index at different stages of the table data life cycle: the index is changed as the target table data moves between stages of the data life cycle. For example, an index on a status field is typically used to query and update operation status; if the historical data is no longer updated, the index on the status field is no longer needed for it and the corresponding index data should be deleted. Likewise, if the user no longer cares about data more than one year old, the index data older than one year should be deleted.
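One way to approximate this kind of life-cycle maintenance with an off-the-shelf engine is sketched below using SQLite's partial indexes through Python's `sqlite3` module. This is a swapped-in technique, not the mechanism of the source: instead of deleting individual items of index data, the sketch drops the index and recreates it with a later cutoff. The table `orders`, its columns, and the index name are hypothetical.

```python
import sqlite3
from datetime import date

INDEX_NAME = "idx_orders_recent"  # hypothetical index name


def rebuild_recent_index(conn: sqlite3.Connection, cutoff_day: str) -> None:
    """Re-create the status index so it covers only rows on or after cutoff_day."""
    # SQLite does not accept bound parameters in a partial index's WHERE clause,
    # so the cutoff is validated as an ISO date before being inlined.
    cutoff = date.fromisoformat(cutoff_day).isoformat()
    with conn:
        conn.execute(f"DROP INDEX IF EXISTS {INDEX_NAME}")
        conn.execute(
            f"CREATE INDEX {INDEX_NAME} ON orders (status) "
            f"WHERE created_day >= '{cutoff}'"
        )
```

Calling the function periodically with a cutoff one year before the current date would keep rows older than a year out of the index, at the cost of a full rebuild each time.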
In some embodiments, the data filtering condition is implemented by a where statement.
In current SQL syntax, where is an instruction that specifies a selection condition and is used in select, delete, and update statements. In the embodiments of the present invention, the data filtering condition in the index creation statement is implemented by a where statement; that is, the target table data is filtered by executing the corresponding where statement.
More specifically, the where statement includes the where instruction and a conditional expression. The conditional expression can be composed of columns, constants, operators, and the like.
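For a concrete feel of a where statement built from a column, an operator, and a constant, the sketch below uses SQLite's partial-index syntax via Python's `sqlite3` module, which accepts a `WHERE` clause at the end of `CREATE INDEX`. The source does not claim this exact syntax; the table `orders`, its columns, and the function name are hypothetical.

```python
import sqlite3


def create_open_orders_index(conn: sqlite3.Connection) -> None:
    """Index the amount column, but only for rows whose status is 'open'."""
    conn.execute(
        "CREATE INDEX idx_open_amount ON orders (amount) "
        "WHERE status = 'open'"  # column, operator, constant
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (status, amount) VALUES (?, ?)",
    [("open", 10), ("closed", 20), ("closed", 30), ("open", 40)],
)
create_open_orders_index(conn)
```

Queries whose own condition implies `status = 'open'` are candidates for using this smaller index; whether it is actually chosen is up to the query planner.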
Preferably, the where statement may be placed at the end of the current SQL index creation statement.
To avoid the negative effects of database developers misusing the where statement, in some embodiments of the present invention the data filtering condition is implemented by a statement containing a range word. The range word may be a range instruction, a before instruction, or the like.
For the same reason, in some embodiments of the present invention the data filtering condition may also be implemented by a statement containing a list word. The list word may be a list instruction.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the data index creation method described in the embodiments of the present invention.
FIG. 3 is a flowchart of a data index creation method according to a preferred embodiment of the present invention. In this preferred embodiment, the data index creation method includes the following steps, as shown in FIG. 3:
Step 301: Receive the user's instruction to create an index.
Here, the instruction to create an index may be an SQL instruction for creating an index.
Step 302: Obtain the data filtering condition from the instruction to create an index.
Step 303: Read target table data. In one embodiment, the target table data refers to the table data in the database to which the instruction to create an index applies.
Step 304: Determine whether the target table data satisfies the data filtering condition; if so, perform step 305; otherwise, perform step 307.
Step 305: Perform calculation on the target table data to obtain index data.
Step 306: Write the index data to the corresponding pointer position in the index.
Step 307: Determine whether unread target table data remains in the database; if so, return to step 303; otherwise, end the procedure.
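The read, check, calculate, and write loop of steps 303 to 307 can be sketched as follows. This is a simplified in-memory model, not the patented implementation: the index is assumed to be a sorted list of `(key, row_id)` pairs, and the function name `create_filtered_index` and its parameters are hypothetical.

```python
import bisect
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]


def create_filtered_index(
    rows: Iterable[Tuple[int, Row]],
    key_column: str,
    matches: Callable[[Row], bool],
) -> List[Tuple[object, int]]:
    """Build a sorted (key, row_id) index over only the rows that pass `matches`."""
    index: List[Tuple[object, int]] = []
    for row_id, row in rows:               # step 303: read target table data
        if not matches(row):               # step 304: check the filter condition
            continue                       # step 307: go on to the next unread row
        entry = (row[key_column], row_id)  # step 305: compute the index data
        bisect.insort(index, entry)        # step 306: write it at its sorted position
    return index                           # step 307: nothing left to read, finish


table = [
    (1, {"status": 0, "name": "carol"}),
    (2, {"status": 1, "name": "alice"}),
    (3, {"status": 0, "name": "bob"}),
]
index = create_filtered_index(table, "name", lambda r: r["status"] == 0)
```

Row 2 fails the filter condition, so it never reaches the calculation or write steps and the index holds two entries instead of three.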
An embodiment of the present invention further provides a data index creation device. FIG. 4 is a schematic structural diagram of a data index creation device according to an embodiment of the present invention. As shown in FIG. 4, the device includes:
a data filtering condition acquisition module 41, configured to acquire a data filtering condition; and
an index creation module 42, configured to filter the target table data according to the data filtering condition acquired by the data filtering condition acquisition module 41 to obtain table data that satisfies the data filtering condition, and to create an index on the table data that satisfies the data filtering condition.
The data index creation device provided by the embodiment of the present invention filters the target table data according to a data filtering condition when creating an index: target table data that does not satisfy the data filtering condition is filtered out, and the index is created only on the table data that satisfies it, so the amount of data in the index is greatly reduced. The smaller the proportion of the target table data that satisfies the data filtering condition, the more the device speeds up index creation; in particular, when facing massive data, the device can create the index faster. In addition, an index created under a filtering condition is used much more efficiently as a whole.
The data filtering condition may be obtained by analyzing a creation instruction sent by the user in SQL form, or by analyzing the user's operational requirements.
If the data filtering condition is obtained by analyzing a creation instruction sent by the user in SQL form, the data filtering condition acquisition module 41 may further include:
a creation instruction receiving unit 411, configured to receive the user's instruction to create an index; and
a creation instruction analysis unit 412, configured to obtain the data filtering condition from the instruction to create an index received by the creation instruction receiving unit 411.
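What the creation instruction analysis unit 412 does can be sketched as extracting the trailing where statement from the text of an index creation instruction. The function `extract_filter_condition` is hypothetical, and the regular expression is a deliberate simplification: it assumes the first standalone word `WHERE` starts the filter condition, which a real SQL parser would not need to assume.

```python
import re
from typing import Optional

# Matches the first standalone WHERE and captures everything after it.
_WHERE_RE = re.compile(r"\bWHERE\b(.*)$", re.IGNORECASE | re.DOTALL)


def extract_filter_condition(create_index_sql: str) -> Optional[str]:
    """Return the conditional expression after WHERE, or None if there is none."""
    statement = create_index_sql.strip().rstrip(";")
    match = _WHERE_RE.search(statement)
    if match is None:
        return None
    return match.group(1).strip()


condition = extract_filter_condition(
    "CREATE INDEX idx_pending ON tasks(status) WHERE status = 0;"
)
no_condition = extract_filter_condition("CREATE INDEX idx_all ON tasks(status)")
```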
Referring to FIG. 4, in some embodiments the index creation module 42 further includes:
a table data reading unit 421, configured to read target table data in the database;
a data filtering condition determining unit 422, configured to determine whether the target table data read by the table data reading unit 421 satisfies the data filtering condition, and to trigger the calculation unit 423 when the target table data satisfies the data filtering condition;
a calculation unit 423, configured to perform calculation on the target table data that satisfies the data filtering condition to obtain index data; and
an index data writing unit 424, configured to write the index data obtained by the calculation unit 423 to the position of the corresponding pointer in the index.
Referring to FIG. 4, the index creation module 42 further includes a data discriminating unit 425. Where the target table data is processed in batches, the data discriminating unit 425 is configured to determine, after the index data writing unit 424 has written the index data to the position of the corresponding pointer in the index, whether the entire index has been created, and, if it has not, to trigger the table data reading unit 421 to continue reading target table data in the database.
In some embodiments, the data filtering condition may include a data filtering condition contained in the user's index creation command.
In other embodiments, the data filtering condition may be obtained from an index creation instruction derived by analyzing the user's operational requirements. Specifically, the data filtering condition acquisition module 41 is configured to obtain the data filtering condition from the SQL command that creates the index.
Specifically, the data filtering condition includes a data filtering condition that reflects the user's operational requirements. It may be obtained by analyzing the query, update, and delete operations the user is likely to perform on the target table data and extracting the conditions from the SQL statements corresponding to those operations. Conditions that appear identically in different SQL statements are de-duplicated, which yields the user's main filtering conditions for the table; then, after performance and other factors are weighed, the one or more data filtering conditions needed to create the index are obtained.
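The derivation of main filtering conditions from the user's SQL statements might look like the sketch below. It rests on two assumptions that are not in the patent: conditions are compared after whitespace normalization only, and "main" is approximated by how often a condition occurs. The function `main_filter_conditions` and the sample statements are hypothetical.

```python
import re
from collections import Counter
from typing import List

_WHERE_RE = re.compile(r"\bWHERE\b(.*)$", re.IGNORECASE | re.DOTALL)


def main_filter_conditions(statements: List[str], top_n: int = 1) -> List[str]:
    """Collect WHERE conditions, merge identical ones, return the most frequent."""
    conditions: Counter = Counter()
    for sql in statements:
        match = _WHERE_RE.search(sql.strip().rstrip(";"))
        if match:
            # Collapse runs of whitespace so identical conditions merge.
            normalized = " ".join(match.group(1).split())
            conditions[normalized] += 1
    return [condition for condition, _ in conditions.most_common(top_n)]


workload = [
    "SELECT * FROM tasks WHERE status = 0",
    "UPDATE tasks SET payload = 'x' WHERE status  =  0;",
    "DELETE FROM tasks WHERE created < '2013-01-01'",
]
main_conditions = main_filter_conditions(workload, top_n=1)
```

The final step in the text, weighing performance and other factors, is left to the developer and is not modeled here.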
In some embodiments, the target table data is all of the data in the data table.
In some embodiments, the data is big data, and the data filtering condition includes the selection condition used to choose, from the groups into which the table data has been divided according to the data life cycle, the target group on which the index is to be created.
During development, the developer groups the data in the big data target table according to the data life cycle. This embodiment then selects, according to the data filtering condition, at least one of the groups as the target group, thereby obtaining the table data that satisfies the data filtering condition. The data filtering condition includes the filtering condition used to select the target group from the groups.
Specifically, when the data in the big data table is grouped, the table structure may be analyzed according to the data grouping conditions derived from the user's operational requirements to find fields that can serve as a time axis. These fields form the time axis, the life-cycle stage of each piece of table data is determined from the time axis, and the table data in the database is divided into multiple groups accordingly. More specifically, the data may be divided by day.
Once the big data has been grouped, one or more groups are selected as the target group according to the data filtering condition derived from the user's operational requirements; the selection condition for the target group is the data filtering condition. This data filtering condition is added to the SQL statement that creates the index, and finally the index is created on the target table data according to that SQL statement.
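Grouping by day along a time axis and then selecting target groups can be sketched as follows. The field name `created`, the one-year window, and the helpers `group_by_day` and `select_target_groups` are illustrative assumptions; the patent only states that the data may be divided by day and that the selection condition is the data filtering condition.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Iterable, List

Row = Dict[str, object]


def group_by_day(rows: Iterable[Row], time_field: str) -> Dict[date, List[Row]]:
    """Divide table rows into one group per day, using `time_field` as the time axis."""
    groups: Dict[date, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[time_field]].append(row)
    return dict(groups)


def select_target_groups(
    groups: Dict[date, List[Row]], today: date, max_age_days: int
) -> Dict[date, List[Row]]:
    """Keep only the groups whose day falls within the last `max_age_days` days."""
    cutoff = today - timedelta(days=max_age_days)
    return {day: rows for day, rows in groups.items() if day >= cutoff}


rows = [
    {"id": 1, "created": date(2014, 5, 1)},
    {"id": 2, "created": date(2014, 5, 1)},
    {"id": 3, "created": date(2013, 1, 1)},
]
groups = group_by_day(rows, "created")
targets = select_target_groups(groups, today=date(2014, 5, 7), max_age_days=365)
```

The selection condition used here (a date no older than the cutoff) is what would then be written into the index creation statement as the data filtering condition.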
In some embodiments, the data filtering condition is implemented by a where statement, by a statement containing a range word, or by a statement containing a list word.
In a preferred embodiment of the present invention, the big data indexing device includes a data filtering condition acquisition module and an index creation module.
The data filtering condition acquisition module 41 is configured to acquire the data filtering condition, and includes:
a creation instruction receiving unit 411, configured to receive the user's instruction to create an index; and
a creation instruction analysis unit 412, configured to obtain the data filtering condition from the instruction to create an index received by the creation instruction receiving unit 411.
The index creation module 42 is configured to filter the target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and to create an index on the table data that satisfies the data filtering condition. The index creation module 42 includes:
a table data reading unit 421, configured to read target table data; in one embodiment, the target table data refers to the table data in the database to which the instruction to create an index applies;
a data filtering condition determining unit 422, configured to determine whether the target table data read by the table data reading unit 421 satisfies the data filtering condition, and to trigger the calculation unit 423 when the target table data satisfies the data filtering condition;
a calculation unit 423, configured to perform calculation on the target table data that satisfies the data filtering condition to obtain index data;
an index data writing unit 424, configured to write the index data obtained by the calculation unit 423 to the corresponding pointer position in the index; and
a data discriminating unit 425, configured to determine, after the index data writing unit 424 has written the index data to the position of the corresponding pointer in the index, whether the index has been created, and, if unread target table data remains in the database, to trigger the table data reading unit 421 to continue reading target table data in the database.
In the embodiments of the present invention, the data index creation device may in practice be implemented by a database. The data filtering condition acquisition module 41 and its sub-units (the creation instruction receiving unit 411 and the creation instruction analysis unit 412), and the index creation module 42 and its sub-units (the table data reading unit 421, the data filtering condition determining unit 422, the calculation unit 423, the index data writing unit 424, and the data discriminating unit 425), may in practice each be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) in the device.
As can be seen from the above, the data index creation device provided by the embodiments of the present invention creates an index with a small amount of data and a high usage rate, which speeds up the user's access to the database through the index. It also lowers the maintenance cost of the index, making index updates easier. In addition, the device manages data according to the big data life cycle, which helps make the database design more sound.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices, and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above are merely implementations of the embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the embodiments of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the embodiments of the present invention.
Industrial applicability
When creating an index, the embodiments of the present invention filter the target table data according to a data filtering condition and filter out the target table data that does not satisfy it, so that an index is created only on part of the data in the database. The amount of data in the index is therefore greatly reduced, and target table data that is rarely or never used is kept out of the index. As a result, the index is used much more efficiently, and the user's access to the database through the index becomes faster.

Claims

1. A data index creation method, comprising the following steps:
acquiring a data filtering condition; and
filtering target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and creating an index on the table data that satisfies the data filtering condition.
2. The method according to claim 1, wherein filtering the target table data according to the data filtering condition to obtain table data that satisfies the data filtering condition, and creating an index on the table data that satisfies the data filtering condition, comprises:
reading target table data in a database;
when it is determined that the target table data satisfies the data filtering condition, performing calculation on the target table data that satisfies the data filtering condition to obtain index data; and
writing the index data to the position of the corresponding pointer in the index.
3. The method according to claim 2, wherein, after the index data is written to the position of the corresponding pointer in the index, the method further comprises:
determining whether the index has been created, to obtain a determination result; when the determination result is that the index has been created, ending the operation; and when the determination result is that the index has not been created, reading target table data in the database again, and, when it is determined that the target table data satisfies the data filtering condition, performing calculation on the target table data that satisfies the data filtering condition to obtain index data, and writing the index data to the position of the corresponding pointer in the index.
4. The method according to claim 1, wherein acquiring the data filtering condition comprises: acquiring the data filtering condition from a Structured Query Language (SQL) command that creates the index.
5. The method according to claim 4, wherein, when the target table data is big data, the data filtering condition comprises: a selection condition used to select, from the groups into which the target table data is divided according to a data life cycle, a target group on which the index is to be created.
6. The method according to claim 4, wherein the data filtering condition is implemented by a where statement, by a statement containing a range word, or by a statement containing a list word.
7. A data index creation device, comprising:
a data filtering condition acquisition module, configured to acquire a data filtering condition; and
an index creation module, configured to filter target table data according to the data filtering condition acquired by the data filtering condition acquisition module to obtain table data that satisfies the data filtering condition, and to create an index on the table data that satisfies the data filtering condition.
8. The device according to claim 7, wherein the index creation module comprises:
a table data reading unit, configured to read target table data in a database;
a data filtering condition determining unit, configured to determine whether the target table data read by the table data reading unit satisfies the data filtering condition, and to trigger a calculation unit when the target table data satisfies the data filtering condition;
the calculation unit, configured to perform calculation on the target table data that satisfies the data filtering condition to obtain index data; and
an index data writing unit, configured to write the index data obtained by the calculation unit to the position of the corresponding pointer in the index.
9. The device according to claim 8, wherein the index creation module further comprises a data discriminating unit, configured to determine, after the index data writing unit writes the index data to the position of the corresponding pointer in the index, whether unread table data exists in the database, and, when the index has not been fully created, to trigger the table data reading unit to continue reading target table data in the database.
10. The device according to claim 7, wherein the data filtering condition acquisition module is configured to acquire the data filtering condition from a Structured Query Language (SQL) command that creates the index.
11. The device according to claim 10, wherein the target table data is big data, and the data filtering condition comprises a selection condition used to select, from the groups into which the table data is divided according to a data life cycle, a target group on which the index is to be created.
12. The device according to claim 10, wherein the data filtering condition is implemented by a where statement, by a statement containing a range word, or by a statement containing a list word.
13. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the data index creation method according to any one of claims 1 to 6.
PCT/CN2014/082640 2014-05-07 2014-07-21 Data index creation method and device, and computer storage medium WO2015168988A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410191061.0A CN105095255A (en) 2014-05-07 2014-05-07 Data index creating method and device
CN201410191061.0 2014-05-07

Publications (1)

Publication Number Publication Date
WO2015168988A1 true WO2015168988A1 (en) 2015-11-12

Family

ID=54392048

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/082640 WO2015168988A1 (en) 2014-05-07 2014-07-21 Data index creation method and device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN105095255A (en)
WO (1) WO2015168988A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776702B (en) * 2016-11-11 2021-03-05 北京奇虎科技有限公司 Method and device for processing indexes in master-slave database system
CN108460052B (en) * 2017-02-22 2022-11-01 中兴通讯股份有限公司 Method and device for automatically creating index and database system
CN110019190B (en) * 2017-09-21 2023-05-30 阿里云计算有限公司 Method and device for creating index
CN109189328B (en) * 2018-08-02 2021-06-25 郑州云海信息技术有限公司 Index table protection method suitable for NAND Flash controller
CN109145004A (en) * 2018-08-29 2019-01-04 智慧互通科技有限公司 A kind of method and device creating database index
CN112612818B (en) * 2020-12-21 2022-04-15 贝壳找房(北京)科技有限公司 Data processing method and device, computing equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078879A1 (en) * 2010-09-27 2012-03-29 Computer Associates Think, Inc. Multi-Dataset Global Index
CN103020305A (en) * 2012-12-29 2013-04-03 天津南大通用数据技术有限公司 Effective index for two-dimensional data table, and method for creating and querying effective index
CN103092886A (en) * 2011-11-07 2013-05-08 中国移动通信集团公司 Achieving method, device and system for data query operation
CN103377210A (en) * 2012-04-19 2013-10-30 北京四维图新科技股份有限公司 Method for creating incremental navigation database and method for updating same
CN103425672A (en) * 2012-05-17 2013-12-04 阿里巴巴集团控股有限公司 Method and device for creating indexes of database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477524A (en) * 2008-12-11 2009-07-08 金蝶软件(中国)有限公司 System performance optimization method and system based on materialized view

Also Published As

Publication number Publication date
CN105095255A (en) 2015-11-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14891367

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14891367

Country of ref document: EP

Kind code of ref document: A1