CN115587115A

CN115587115A - A database query optimization method and system

Info

Publication number: CN115587115A
Application number: CN202211587212.5A
Authority: CN
Inventors: 颜宇杰; 王亮; 张恒基; 李河东; 孙智贤
Original assignee: Southwest Petroleum University
Current assignee: Southwest Petroleum University
Priority date: 2022-12-12
Filing date: 2022-12-12
Publication date: 2023-01-10
Anticipated expiration: 2042-12-12
Also published as: CN115587115B

Abstract

The invention relates to the field of database querying and discloses a database query optimization method and system. The system comprises a retrieval space establishing module, a retrieval space mapping module, a feature association retrieval module and a data display verification module. By setting up a physical optimization space independent of the database, the occupation of the database's data throughput channel and of its computing capacity during retrieval can be effectively reduced; in addition, a multi-feature overlap retrieval method is adopted in which data objects are broken down into features.

Description

A database query optimization method and system

Technical field

The invention relates to the field of database querying, and in particular to a database query optimization method and system.

Background

In the database field, the ability to query data is one of the key criteria by which a database is measured. If data queries are too inefficient, the database as a whole responds poorly and cannot meet users' need to quickly find and obtain data content. At the same time, the share of the database's data throughput and computing capacity that a query occupies also affects the user experience during querying, as well as the rate at which the database's health is consumed.

Most data query methods in the prior art run an indiscriminate query comparison against the database based on the "AND"/"OR" relationships among the user's multiple search terms. This occupies a large amount of data throughput and computing power. Moreover, because of how the data content is searched, when one term is off, combining multiple terms shifts the query result, so that the required data content cannot be found correctly.

Summary of the invention

The purpose of the present invention is to provide a database query optimization method and system that solve the problems raised in the background above.

To achieve the above purpose, the present invention provides the following technical solutions:

A database query optimization system, comprising:

a retrieval space establishing module, configured to construct a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is configured to store retrieval features, the optimized retrieval unit is configured to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to the database;

a retrieval space mapping module, configured to establish data pointing links in one-to-one correspondence with the data in the database and store them in the optimized storage unit, acquire the retrieval features of the corresponding data in the database through a feature acquisition program, bind the retrieval features to the corresponding data pointing links, and merge retrieval features with identical content, wherein the retrieval features comprise title features, content features and user-tag features of the data;

a feature association retrieval module, configured to acquire a data query instruction from a user, wherein the data query instruction contains multiple groups of retrieval features, traverse the optimized storage unit in turn based on the retrieval features to obtain multiple data pointing links, and count the traversal responses of the data pointing links with a retrieval counter to generate an associated retrieval result;

a data display verification module, configured to sort the data pointing links in the associated retrieval result in descending order of traversal response count, obtain part of the content of the corresponding data in the database through the data pointing links to generate a verification preview, output the verification preview, and receive query confirmation feedback from the user.
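The mapping and counting behaviour of the modules above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `OptimizationSpace`, the method names and the `db://docs/...` link strings are hypothetical, and an in-memory dictionary stands in for the optimized storage unit.

```python
from collections import Counter, defaultdict


class OptimizationSpace:
    """Toy stand-in for the physical optimization space (kept outside the database)."""

    def __init__(self):
        # Optimized storage unit: each distinct retrieval feature is stored once
        # and bound to the set of data pointing links that carry it.
        self.feature_to_links = defaultdict(set)

    def map_record(self, link, features):
        """Bind the retrieval features of one database record to its pointing link."""
        for feature in features:
            # Identical features are merged because they share one dictionary key.
            self.feature_to_links[feature].add(link)

    def retrieve(self, query_features):
        """Count, per pointing link, how many query features hit it; sort descending."""
        counter = Counter()  # plays the role of the retrieval counter
        for feature in query_features:
            for link in self.feature_to_links.get(feature, ()):
                counter[link] += 1
        # Sort by count (descending), then by link for a deterministic order.
        return sorted(counter.items(), key=lambda item: (-item[1], item[0]))


space = OptimizationSpace()
space.map_record("db://docs/1", ["pipeline", "corrosion", "report"])
space.map_record("db://docs/2", ["pipeline", "budget"])
space.map_record("db://docs/3", ["corrosion", "report", "2021"])

ranked = space.retrieve(["pipeline", "corrosion", "report"])
```

In this sketch the database itself is never touched during retrieval; only the ranked pointing links would later be used to fetch content.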

As a further solution of the present invention, the retrieval space mapping module comprises:

a media feature acquisition module, configured to recognize media content through a preset media object recognition program, obtain the element composition of the media image, and set a feature ratio for each corresponding retrieval feature based on how closely the element content matches the reference elements in a preset recognition library, wherein each reference element comprises multiple retrieval features, the different retrieval features of one reference element distinguish different ways of expressing it, and the feature ratio serves as the counting coefficient when traversal responses are counted.
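One way to read "the feature ratio serves as the counting coefficient" is that each hit adds the ratio rather than 1 to the count. The Python sketch below works under that assumption; the index contents and ratio values are invented for illustration, and the recognition step that would produce them is not shown.

```python
from collections import defaultdict

# feature -> {link: feature ratio in (0, 1]}; hypothetical values that a media
# recognition step might have produced for two images.
weighted_index = {
    "cat": {"db://img/7": 0.9, "db://img/8": 0.4},
    "kitten": {"db://img/7": 0.6},
    "sofa": {"db://img/8": 0.8},
}


def weighted_retrieve(index, query_features):
    """Add the feature ratio (instead of 1) for every hit, then sort descending."""
    scores = defaultdict(float)
    for feature in query_features:
        for link, ratio in index.get(feature, {}).items():
            scores[link] += ratio
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


result = weighted_retrieve(weighted_index, ["cat", "sofa"])
```

Here the second image outranks the first for the query "cat" plus "sofa", even though its "cat" ratio is lower, because both query features contribute to its score.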

As a still further solution of the present invention, the feature association retrieval module comprises:

an additional filtering unit, configured to receive additional query conditions from the user and filter the data pointing links based on them, wherein the additional query conditions are independent of the retrieval features and apply to every data pointing link and to the database data, and comprise time information, file uploader, file type and file data size.
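A minimal sketch of such a filter in Python follows. It assumes that metadata is stored alongside each data pointing link; the `LinkMeta` fields and the `apply_filters` parameters are hypothetical names chosen to mirror the four condition types listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LinkMeta:
    """Metadata kept alongside a data pointing link (field names are illustrative)."""
    link: str
    uploaded: date
    uploader: str
    file_type: str
    size_bytes: int


def apply_filters(candidates, *, after=None, uploader=None, file_type=None, max_size=None):
    """Keep only candidates that satisfy every condition that was supplied."""
    kept = []
    for meta in candidates:
        if after is not None and meta.uploaded < after:
            continue
        if uploader is not None and meta.uploader != uploader:
            continue
        if file_type is not None and meta.file_type != file_type:
            continue
        if max_size is not None and meta.size_bytes > max_size:
            continue
        kept.append(meta)
    return kept


candidates = [
    LinkMeta("db://docs/1", date(2021, 3, 1), "alice", "pdf", 120_000),
    LinkMeta("db://docs/2", date(2022, 6, 9), "bob", "pdf", 80_000),
    LinkMeta("db://docs/3", date(2022, 8, 2), "alice", "docx", 40_000),
]
recent_pdfs = apply_filters(candidates, after=date(2022, 1, 1), file_type="pdf")
```

Because the conditions are independent of the retrieval features, the filter can run before or after counting; the sketch leaves that choice open.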

As a still further solution of the present invention, the system further comprises a feature fuzzing module, which specifically comprises:

a lexical fuzzing unit, configured to perform lexical fuzzing on the retrieval feature, obtain retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assign a feature ratio to each fuzzy retrieval feature based on the degree of overlap between its textual expression and the retrieval feature;

a semantic fuzzing unit, configured to perform word-meaning fuzzing on the retrieval feature, obtain retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assign a feature ratio to each fuzzy retrieval feature based on a preset word-meaning fuzzy grade library, wherein the word-meaning fuzzy grade library comprises multiple storage spaces of similar words, each corresponding to a different feature ratio.
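The two fuzzing units can be sketched in Python as follows. The patent does not name a similarity measure, so the use of `difflib.SequenceMatcher` for textual overlap is an assumption, as are the 0.7 threshold, the vocabulary and the contents of the `SENSE_GRADES` library.

```python
from difflib import SequenceMatcher

stored_features = ["pipeline", "pipelines", "pipe", "corrosion", "budget"]

# Hypothetical word-meaning fuzzy grade library: each grade is one storage space
# of near-synonyms that shares a single feature ratio.
SENSE_GRADES = {
    "pipeline": [(0.8, {"conduit", "pipe"}), (0.5, {"tube", "duct"})],
}


def lexical_fuzz(feature, vocabulary, threshold=0.7):
    """Return {similar stored feature: overlap ratio} based on character overlap."""
    fuzzy = {}
    for candidate in vocabulary:
        if candidate == feature:
            continue
        ratio = SequenceMatcher(None, feature, candidate).ratio()
        if ratio >= threshold:
            fuzzy[candidate] = round(ratio, 2)
    return fuzzy


def semantic_fuzz(feature, grades=SENSE_GRADES):
    """Return {near-synonym: ratio of the grade it is stored in}."""
    fuzzy = {}
    for ratio, words in grades.get(feature, []):
        for word in words:
            fuzzy[word] = ratio
    return fuzzy


lexical = lexical_fuzz("pipeline", stored_features)
semantic = semantic_fuzz("pipeline")
```

The returned ratios could then be used as counting coefficients for hits on the fuzzy retrieval features, so that approximate matches contribute less than exact ones.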

As a still further solution of the present invention, the system further comprises an object association module;

the object association module is configured to build a retrieval feature association tree for each user based on the user's retrieval preferences and the final results of query confirmation feedback, wherein the retrieval feature association tree characterizes, for a search by a given retrieval feature, the association with other retrieval features that the retrieved object may contain but that the user did not enter.
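The patent does not specify the structure of the association tree. As one possible reading, the Python sketch below keeps a shallow two-level tree per user (entered feature, then co-occurring features with counts) and updates it each time the user confirms a result; all names are hypothetical.

```python
from collections import Counter, defaultdict

# user -> entered feature -> Counter of features the confirmed object also had
association = defaultdict(lambda: defaultdict(Counter))


def record_confirmation(user, entered_features, confirmed_object_features):
    """Update the user's association tree after the user confirms a result."""
    entered = set(entered_features)
    # Features carried by the confirmed object that the user did not type in.
    missing = set(confirmed_object_features) - entered
    for feature in entered:
        association[user][feature].update(missing)


def suggest(user, feature, top_n=2):
    """Features this user's confirmed results most often carried besides `feature`."""
    ranked = sorted(association[user][feature].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


record_confirmation("alice", ["pipeline"], ["pipeline", "corrosion", "report"])
record_confirmation("alice", ["pipeline", "budget"], ["pipeline", "budget", "corrosion"])

suggestions = suggest("alice", "pipeline")
```

Such suggestions could feed the fuzzy association step, for example by adding the suggested features to a query with a reduced coefficient.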

An embodiment of the present invention aims to provide a database query optimization method, comprising the steps of:

constructing a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is configured to store retrieval features, the optimized retrieval unit is configured to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to the database;

establishing data pointing links in one-to-one correspondence with the data in the database and storing them in the optimized storage unit, acquiring the retrieval features of the corresponding data in the database through a feature acquisition program, binding the retrieval features to the corresponding data pointing links, and merging retrieval features with identical content, wherein the retrieval features comprise title features, content features and user-tag features of the data;

acquiring a data query instruction from a user, wherein the data query instruction contains multiple groups of retrieval features, traversing the optimized storage unit in turn based on the retrieval features to obtain multiple data pointing links, and counting the traversal responses of the data pointing links with a retrieval counter to generate an associated retrieval result;

sorting the data pointing links in the associated retrieval result in descending order of traversal response count, obtaining part of the content of the corresponding data in the database through the data pointing links to generate a verification preview, outputting the verification preview, and receiving query confirmation feedback from the user.
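The last step, fetching only part of each record to build a verification preview, can be sketched as follows. The sketch assumes an in-memory SQLite database as a stand-in for the real database, integer row ids as data pointing links, and a hypothetical `documents` table; the preview length is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents (id, title, body) VALUES (?, ?, ?)",
    [
        (1, "Corrosion report", "Annual inspection of the northern pipeline section."),
        (2, "Budget", "Pipeline maintenance budget for the coming year."),
    ],
)


def build_previews(connection, ranked_links, preview_chars=20):
    """Fetch only the title and the first characters of each ranked record."""
    previews = []
    for record_id, count in ranked_links:
        row = connection.execute(
            "SELECT title, substr(body, 1, ?) FROM documents WHERE id = ?",
            (preview_chars, record_id),
        ).fetchone()
        if row is not None:
            previews.append(
                {"id": record_id, "count": count, "title": row[0], "preview": row[1]}
            )
    return previews


# (record id, traversal response count), already sorted in descending order.
previews = build_previews(conn, [(1, 3), (2, 1)])
```

The database is queried only for the records that survived ranking, and only for a short excerpt of each, which is consistent with the stated goal of reducing load on the database.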

As a further solution of the present invention, the step of acquiring the retrieval features of the corresponding data in the database through the feature acquisition program comprises:

recognizing media content through a preset media object recognition program, obtaining the element composition of the media image, and setting a feature ratio for each corresponding retrieval feature based on how closely the element content matches the reference elements in a preset recognition library, wherein each reference element comprises multiple retrieval features, the different retrieval features of one reference element distinguish different ways of expressing it, and the feature ratio serves as the counting coefficient when traversal responses are counted.

As a still further solution of the present invention, the method further comprises an additional retrieval step:

receiving additional query conditions from the user and filtering the data pointing links based on them, wherein the additional query conditions are independent of the retrieval features and apply to every data pointing link and to the database data, and comprise time information, file uploader, file type and file data size.

As a still further solution of the present invention, the method further comprises the steps of:

performing lexical fuzzing on the retrieval feature, obtaining retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assigning a feature ratio to each fuzzy retrieval feature based on the degree of overlap between its textual expression and the retrieval feature;

performing word-meaning fuzzing on the retrieval feature, obtaining retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assigning a feature ratio to each fuzzy retrieval feature based on a preset word-meaning fuzzy grade library, wherein the word-meaning fuzzy grade library comprises multiple storage spaces of similar words, each corresponding to a different feature ratio.

building a retrieval feature association tree for each user based on the user's retrieval preferences and the final results of query confirmation feedback, wherein the retrieval feature association tree characterizes, for a search by a given retrieval feature, the association with other retrieval features that the retrieved object may contain but that the user did not enter.

Compared with the prior art, the present invention has the following beneficial effects. By setting up a physical optimization space that exists independently of the database, the occupation of the database's data throughput channel and of its computing capacity during retrieval can be effectively reduced. In addition, with the multi-feature overlap retrieval method, in which data objects are broken down into features and identical features are merged, the amount of data examined during retrieval can be greatly reduced and retrieval efficiency improved, so that a large number of data objects matching the features can be obtained in a short time and then screened in combination.

Description of the drawings

Fig. 1 is a block diagram of a database query optimization system.

Fig. 2 is a block diagram of the feature fuzzing module in a database query optimization system.

Fig. 3 is a flowchart of a database query optimization method.

Detailed description

To make the purpose, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

Specific implementations of the present invention are described in detail below with reference to specific embodiments.

As shown in Fig. 1, a database query optimization system provided by an embodiment of the present invention comprises:

a retrieval space establishing module 100, configured to construct a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is configured to store retrieval features, the optimized retrieval unit is configured to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to the database.

a retrieval space mapping module 300, configured to establish data pointing links in one-to-one correspondence with the data in the database and store them in the optimized storage unit, acquire the retrieval features of the corresponding data in the database through a feature acquisition program, bind the retrieval features to the corresponding data pointing links, and merge retrieval features with identical content, wherein the retrieval features comprise title features, content features and user-tag features of the data.

a feature association retrieval module 500, configured to acquire a data query instruction from a user, wherein the data query instruction contains multiple groups of retrieval features, traverse the optimized storage unit in turn based on the retrieval features to obtain multiple data pointing links, and count the traversal responses of the data pointing links with a retrieval counter to generate an associated retrieval result.

a data display verification module 700, configured to sort the data pointing links in the associated retrieval result in descending order of traversal response count, obtain part of the content of the corresponding data in the database through the data pointing links to generate a verification preview, output the verification preview, and receive query confirmation feedback from the user.

This embodiment provides a data query optimization system. By setting up a physical optimization space that exists independently of the database, the occupation of the database's data throughput channel and of its computing capacity during retrieval can be effectively reduced. In addition, with the multi-feature overlap retrieval method, in which data objects are broken down into features and identical features are merged, the amount of data examined during retrieval can be greatly reduced and retrieval efficiency improved, so that a large number of data objects matching the features can be obtained in a short time and then screened in combination.

Specifically, a physical optimization space is established alongside the database and connected to it. When new data is stored in the database, its retrieval features (data features such as keywords in the title, or content keywords or high-frequency words in the body of the data) are acquired, a data pointing link to that data in the database is established and bound to each feature, and each feature is merged with the identical feature already present in the physical optimization space. At retrieval time, when the user enters one retrieval feature, multiple data pointing links can be obtained from it; when the user enters multiple retrieval keywords, the same data pointing link may be read several times. The data pointing links are therefore counted, so that the link read most often for the user's keywords is known, and the query result is output on that basis. On output, part of the data content can be fetched from the database as a preview so that the user can confirm it. Compared with the "AND"/"OR" logic that the prior art applies when multiple keywords are given side by side, the present application does not search the data content directly; instead, it marks each data item with a retrieval count based on keyword lookup. This effectively avoids the problem of existing "AND"/"OR" retrieval logic, where a larger number of keywords shifts the retrieval result or even makes the correct data content impossible to find. In the present application, adding more retrieval keywords only increases the accuracy of the retrieval count and does not cause the retrieval results to shrink multiplicatively.
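The contrast drawn above between strict "AND" logic and count-based ranking can be made concrete with a small Python sketch. The index contents are invented, and the scenario assumes a user who misremembers one keyword.

```python
index = {
    "pipeline": {"doc1", "doc2"},
    "corrosion": {"doc1", "doc3"},
    "leakage": {"doc3"},  # the user misremembers: doc1 never mentions "leakage"
}


def and_query(index, features):
    """Conventional AND logic: a record must match every feature."""
    sets = [index.get(f, set()) for f in features]
    return set.intersection(*sets) if sets else set()


def count_query(index, features):
    """Counting logic: rank every touched record by its number of hits."""
    counts = {}
    for feature in features:
        for link in index.get(feature, set()):
            counts[link] = counts.get(link, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


query = ["pipeline", "corrosion", "leakage"]
strict = and_query(index, query)    # empty: no record matches all three terms
ranked = count_query(index, query)  # every touched record, best matches first
```

With the extra, partly wrong keyword, the AND query returns nothing, while the counting query still surfaces `doc1` and `doc3` at the top for the user to verify.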

As another preferred embodiment of the present invention, the retrieval space mapping module 300 comprises:

In this embodiment, a media feature acquisition module is added to the retrieval space mapping module 300 to account for the data type. Media data (here mainly image media and video media, where video media can be regarded as multiple consecutive groups of image media) differs from basic data content such as text in that its retrieval features (that is, the keywords of its content) can be indeterminate: the same image element can be expressed by various keywords, and similar elements may be expressed in similar ways. Special content element recognition and keyword extraction are therefore required, with each element corresponding to multiple keywords. Different but similar expressions should also be given a feature ratio that expresses how closely the two coincide; this ratio is used as a multiplying coefficient during query retrieval, which improves the credibility of the data produced during counting.

As another preferred embodiment of the present invention, the feature association retrieval module 500 comprises:

In this embodiment, the additional filtering unit narrows the scope of retrieval through additional query conditions. For example, setting a time range for the data objects to be retrieved can greatly reduce the amount of data searched and effectively improve the retrieval success rate.

As shown in Fig. 2, another preferred embodiment of the present invention further comprises a feature fuzzing module 900, which specifically comprises:

a lexical fuzzing unit 901, configured to perform lexical fuzzing on the retrieval feature, obtain retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assign a feature ratio to each fuzzy retrieval feature based on the degree of overlap between its textual expression and the retrieval feature.

a semantic fuzzing unit 902, configured to perform word-meaning fuzzing on the retrieval feature, obtain retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assign a feature ratio to each fuzzy retrieval feature based on a preset word-meaning fuzzy grade library, wherein the word-meaning fuzzy grade library comprises multiple storage spaces of similar words, each corresponding to a different feature ratio.

In this embodiment, the feature fuzzing module 900 performs fuzzy association on the words entered by the user during retrieval. When the user's memory of the content to be retrieved is wrong or inaccurate, it widens the retrieval range and raises the chance of finding the corresponding content. The module is switched on as needed when the user fails to retrieve the corresponding content. Because the user's misstatement may involve keywords that are close either in wording or in meaning, the module includes both the lexical fuzzing unit 901 and the semantic fuzzing unit 902.

As another preferred embodiment of the present invention, the system further comprises an object association module;

In this embodiment, the object association module is set up per user. Based on a user's retrieval and query habits, an association tree of that user's retrieval keywords can be built, which can be used to improve the accuracy of the system's fuzzy association during retrieval and querying.

As shown in Fig. 3, the present invention also provides a database query optimization method, comprising the steps of:

S200, constructing a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is configured to store retrieval features, the optimized retrieval unit is configured to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to the database.

S400, establishing data pointing links in one-to-one correspondence with the data in the database and storing them in the optimized storage unit, acquiring the retrieval features of the corresponding data in the database through a feature acquisition program, binding the retrieval features to the corresponding data pointing links, and merging retrieval features with identical content, wherein the retrieval features comprise title features, content features and user-tag features of the data.

S600, acquiring a data query instruction from a user, wherein the data query instruction contains multiple groups of retrieval features, traversing the optimized storage unit in turn based on the retrieval features to obtain multiple data pointing links, and counting the traversal responses of the data pointing links with a retrieval counter to generate an associated retrieval result.

S800, sorting the data pointing links in the associated retrieval result in descending order of traversal response count, obtaining part of the content of the corresponding data in the database through the data pointing links to generate a verification preview, outputting the verification preview, and receiving query confirmation feedback from the user.

As another preferred embodiment of the present invention, the step of acquiring the retrieval features of the corresponding data in the database through the feature acquisition program comprises:

As another preferred embodiment of the present invention, the step of generating the associated retrieval result further comprises an additional retrieval step:

As another preferred embodiment of the present invention, the method further comprises the steps of:

performing lexical fuzzing on the retrieval feature, obtaining retrieval features whose textual expression is close to that of the retrieval feature as fuzzy retrieval features, and assigning a feature ratio to each fuzzy retrieval feature based on the degree of overlap between its textual expression and the retrieval feature.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).

本领域技术人员在考虑说明书及实施例处的公开后，将容易想到本公开的其它实施方案。本申请旨在涵盖本公开的任何变型、用途或者适应性变化，这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的，本公开的真正范围和精神由权利要求指出。Other embodiments of the present disclosure will readily occur to those skilled in the art from consideration of the disclosure at the specification and examples. This application is intended to cover any modification, use or adaptation of the present disclosure, and these modifications, uses or adaptations follow the general principles of the present disclosure and include common knowledge or conventional technical means in the technical field not disclosed in the present disclosure . The specification and examples are to be considered exemplary only, with the true scope and spirit of the disclosure indicated by the appended claims.

It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A database query optimization system, comprising:

a retrieval space establishing module, configured to establish a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is configured to store retrieval features, the optimized retrieval unit is configured to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to a database;

a retrieval space mapping module, configured to establish data pointing links in one-to-one correspondence with the data in the database and store the data pointing links in the optimized storage unit, to acquire retrieval features of the corresponding data in the database through a feature acquisition program, to bind the retrieval features to the corresponding data pointing links, and to merge retrieval features having the same content, wherein the retrieval features comprise title features, content features, and user mark features of the data;

a feature association retrieval module, configured to acquire a data query instruction from a user, the data query instruction comprising a plurality of groups of retrieval features, to traverse the optimized storage unit sequentially based on the retrieval features to obtain a plurality of data pointing links, and to perform traversal response counting on the data pointing links through a retrieval counter to generate an associated retrieval result;

and a data display verification module, configured to arrange the plurality of data pointing links in the associated retrieval result in descending order based on the traversal response counts, to acquire partial content of the corresponding data in the database through the data pointing links to generate a verification preview, to output the verification preview, and to receive query confirmation feedback from the user.

2. The database query optimization system of claim 1, wherein the retrieval space mapping module comprises:

a media feature acquisition module, configured to identify media content through a preset media object identification program, to obtain the element content composition of the media image, and to set a feature proportion for the corresponding retrieval features based on the degree of matching between the element content and comparison elements in a preset identification library, wherein each comparison element comprises a plurality of retrieval features, different retrieval features of the same comparison element distinguish different modes of expression, and the feature proportion serves as a counting coefficient applied during traversal response counting.

3. The database query optimization system of claim 2, wherein the feature association retrieval module comprises:

an additional screening unit, configured to receive an additional query condition from the user and to screen the data pointing links based on the additional query condition, wherein the additional query condition is independent of the retrieval features and acts on each data pointing link and on the database data, and the additional query condition comprises time information, file uploader, file type, and file data volume.

4. The database query optimization system according to claim 3, further comprising a feature fuzzy module, wherein the feature fuzzy module specifically comprises:

a vocabulary fuzzy unit, configured to perform vocabulary fuzzing on the retrieval features, to acquire retrieval features whose textual expression is close to that of the retrieval features as fuzzy retrieval features, and to assign feature proportions to the fuzzy retrieval features based on the degree of coincidence between their textual expression and that of the retrieval features;

and a word sense fuzzy unit, configured to perform word sense fuzzing on the retrieval features, to acquire retrieval features whose textual expression is close to that of the retrieval features as fuzzy retrieval features, and to assign feature proportions to the fuzzy retrieval features based on a preset word sense fuzzy level library, wherein the word sense fuzzy level library comprises a plurality of similar-vocabulary storage spaces, each corresponding to a different feature proportion.

5. The database query optimization system of claim 1, further comprising an object association module;

the object association module is configured to establish retrieval feature association trees for different users based on the retrieval preferences of the different users and the final results of the query confirmation feedback, wherein a retrieval feature association tree represents, when a user retrieves through a given retrieval feature, the association between that retrieval feature and other retrieval features, not entered by the user, that the retrieved object may contain.

6. A method for optimizing a database query, comprising the steps of:

constructing a physical optimization space for retrieval, wherein the physical optimization space comprises an optimized storage unit and an optimized retrieval unit, the optimized storage unit is used for storing retrieval features, the optimized retrieval unit is used for traversing the optimized storage unit in response to a data query instruction from a user, and the physical optimization space is communicatively connected to a database;

establishing data pointing links in one-to-one correspondence with the data in the database and storing the data pointing links in the optimized storage unit, acquiring retrieval features of the corresponding data in the database through a feature acquisition program, binding the retrieval features to the corresponding data pointing links, and merging retrieval features having the same content, wherein the retrieval features comprise title features, content features, and user mark features of the data;

acquiring a data query instruction from a user, wherein the data query instruction comprises a plurality of groups of retrieval features, traversing the optimized storage unit sequentially based on the retrieval features to obtain a plurality of data pointing links, and performing traversal response counting on the data pointing links through a retrieval counter to generate an associated retrieval result;

and arranging the plurality of data pointing links in the associated retrieval result in descending order based on the traversal response counts, acquiring partial content of the corresponding data in the database through the data pointing links to generate a verification preview, outputting the verification preview, and receiving query confirmation feedback from the user.
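The steps of claim 6 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class `OptimizedSpace`, the function `preview`, the dictionary standing in for the database, and the sample records are all hypothetical names chosen for illustration. The sketch assumes that a data pointing link can be modeled as a record key and that the optimized storage unit behaves like a mapping from retrieval feature to links.

```python
from collections import Counter, defaultdict


class OptimizedSpace:
    """Hypothetical stand-in for the physical optimization space."""

    def __init__(self):
        # Optimized storage unit: retrieval feature -> set of data pointing links.
        # Keying by feature merges retrieval features with the same content.
        self.storage = defaultdict(set)

    def bind(self, link, features):
        """Bind the retrieval features of one record to its data pointing link."""
        for feature in features:
            self.storage[feature].add(link)

    def query(self, features):
        """Traverse storage once per query feature, counting responses per link."""
        counter = Counter()  # plays the role of the retrieval counter
        for feature in features:
            for link in self.storage.get(feature, ()):
                counter[link] += 1
        # Descending by count; ties are broken by link so the order is stable.
        return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))


def preview(database, link, length=20):
    """Fetch part of the record behind a link to build a verification preview."""
    return database[link][:length]


# Illustrative data only: the database is touched solely to build previews.
database = {
    1: "Quarterly drilling report for block A",
    2: "Pipeline maintenance log",
    3: "Drilling safety checklist",
}
space = OptimizedSpace()
space.bind(1, ["drilling", "report", "block A"])
space.bind(2, ["pipeline", "maintenance"])
space.bind(3, ["drilling", "safety"])

results = space.query(["drilling", "report"])
previews = [preview(database, link) for link, _ in results]
```

In this sketch the query never scans the database itself; only the ranked links are dereferenced, which reflects the claim's separation between the optimization space and the database.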

7. The database query optimization method according to claim 6, wherein the step of acquiring the retrieval features of the corresponding data in the database through the feature acquisition program comprises:

identifying media content through a preset media object identification program, obtaining the element content composition of the media image, and setting a feature proportion for the corresponding retrieval features based on the degree of matching between the element content and comparison elements in a preset identification library, wherein each comparison element comprises a plurality of retrieval features, different retrieval features of the same comparison element distinguish different modes of expression, and the feature proportion serves as a counting coefficient applied during traversal response counting.
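One way to read the feature proportion of claim 7 is as a weight that replaces the unit increment in traversal response counting. The sketch below assumes this reading. The function `weighted_response_count`, the storage layout (feature mapped to links with a proportion each), and the sample proportions are hypothetical, and the media identification step itself is not modeled.

```python
from collections import defaultdict


def weighted_response_count(storage, query_features):
    """Sum feature proportions per link instead of counting each hit as 1.

    storage: feature -> {link: feature proportion}, a hypothetical layout.
    """
    scores = defaultdict(float)
    for feature in query_features:
        for link, proportion in storage.get(feature, {}).items():
            scores[link] += proportion  # proportion acts as the counting coefficient
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative proportions, as if produced by matching image elements
# against comparison elements in an identification library.
storage = {
    "cat": {"img-1": 0.75, "img-2": 0.5},
    "kitten": {"img-2": 0.5},  # another expression of the same comparison element
    "sofa": {"img-1": 0.5},
}
ranked = weighted_response_count(storage, ["cat", "sofa"])
```

Under this reading, a link whose media content matches a comparison element only weakly contributes less to the ranking than a strong match on the same feature.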

8. The database query optimization method of claim 7, further comprising the additional retrieval step of:

receiving an additional query condition from the user, and screening the data pointing links based on the additional query condition, wherein the additional query condition is independent of the retrieval features and acts on each data pointing link and on the database data, and the additional query condition comprises time information, file uploader, file type, and file data volume.
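The additional screening of claim 8 can be sketched as a filter applied to link metadata, independently of any retrieval feature. The sketch assumes, purely for illustration, that each data pointing link carries its upload date, uploader, file type, and size; `LinkRecord`, `screen_links`, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LinkRecord:
    """Hypothetical metadata kept alongside one data pointing link."""
    link: str
    uploaded: date
    uploader: str
    file_type: str
    size_bytes: int


def screen_links(records, *, uploaded_after=None, uploader=None,
                 file_type=None, max_size_bytes=None):
    """Keep links that satisfy every condition that was actually supplied."""
    kept = []
    for record in records:
        if uploaded_after is not None and record.uploaded < uploaded_after:
            continue
        if uploader is not None and record.uploader != uploader:
            continue
        if file_type is not None and record.file_type != file_type:
            continue
        if max_size_bytes is not None and record.size_bytes > max_size_bytes:
            continue
        kept.append(record.link)
    return kept


records = [
    LinkRecord("link-1", date(2022, 3, 1), "alice", "pdf", 120_000),
    LinkRecord("link-2", date(2021, 7, 15), "bob", "pdf", 80_000),
    LinkRecord("link-3", date(2022, 9, 30), "alice", "docx", 40_000),
]
recent_pdfs = screen_links(records, uploaded_after=date(2022, 1, 1), file_type="pdf")
```

Conditions left as `None` are ignored, so the same function covers any combination of the four condition types named in the claim.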

9. The database query optimization method according to claim 8, further comprising the steps of:

performing vocabulary fuzzing on the retrieval features, acquiring retrieval features whose textual expression is close to that of the retrieval features as fuzzy retrieval features, and assigning feature proportions to the fuzzy retrieval features based on the degree of coincidence between their textual expression and that of the retrieval features;

and performing word sense fuzzing on the retrieval features, acquiring retrieval features whose textual expression is close to that of the retrieval features as fuzzy retrieval features, and assigning feature proportions to the fuzzy retrieval features based on a preset word sense fuzzy level library, wherein the word sense fuzzy level library comprises a plurality of similar-vocabulary storage spaces, each corresponding to a different feature proportion.
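The two fuzzing steps of claim 9 can be sketched as follows. The claims do not specify how the degree of coincidence is computed; this sketch substitutes the similarity ratio from Python's standard `difflib.SequenceMatcher`, which is an assumption. The functions `vocabulary_fuzz` and `word_sense_fuzz`, the threshold of 0.6, and the contents of `SENSE_LEVELS` are likewise hypothetical.

```python
from difflib import SequenceMatcher


def vocabulary_fuzz(feature, stored_features, threshold=0.6):
    """Return stored features spelled similarly to `feature`.

    The similarity ratio doubles as the feature proportion (an assumption).
    """
    fuzzy = {}
    for candidate in stored_features:
        if candidate == feature:
            continue  # exact matches are handled by ordinary retrieval
        ratio = SequenceMatcher(None, feature, candidate).ratio()
        if ratio >= threshold:
            fuzzy[candidate] = ratio
    return fuzzy


# Hypothetical word sense fuzzy level library:
# feature -> {feature proportion of a level: similar words stored at that level}
SENSE_LEVELS = {
    "car": {0.8: {"automobile"}, 0.5: {"vehicle"}},
}


def word_sense_fuzz(feature, level_library):
    """Return similar-meaning words, each weighted by its level's proportion."""
    fuzzy = {}
    for proportion, words in level_library.get(feature, {}).items():
        for word in words:
            fuzzy[word] = proportion
    return fuzzy


spelling_matches = vocabulary_fuzz("report", ["reports", "pipeline", "report"])
sense_matches = word_sense_fuzz("car", SENSE_LEVELS)
```

The resulting proportions could then be used as counting coefficients when the fuzzy retrieval features take part in traversal response counting.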

10. The database query optimization method according to claim 6, further comprising the steps of:

establishing retrieval feature association trees for different users based on the retrieval preferences of the different users and the final results of the query confirmation feedback, wherein a retrieval feature association tree represents, when a user retrieves through a given retrieval feature, the association between that retrieval feature and other retrieval features, not entered by the user, that the retrieved object may contain.
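A retrieval feature association tree in the sense of claim 10 could be approximated by a per-user, two-level structure: each entered feature is a node whose children are the features of confirmed objects that the user did not enter, weighted by how often they co-occurred. This is only one plausible reading. The class `AssociationTree`, its methods, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


class AssociationTree:
    """Hypothetical per-user association structure built from confirmed queries."""

    def __init__(self):
        # user -> entered feature -> Counter of features the user did not enter
        self.tree = defaultdict(lambda: defaultdict(Counter))

    def record(self, user, entered, confirmed_object_features):
        """Update the tree after the user confirms a retrieved object."""
        not_entered = set(confirmed_object_features) - set(entered)
        for feature in entered:
            self.tree[user][feature].update(not_entered)

    def suggest(self, user, feature, n=3):
        """Return up to n associated features, most frequently co-occurring first."""
        counts = self.tree.get(user, {}).get(feature, Counter())
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:n]]


tree = AssociationTree()
tree.record("alice", ["drilling"], ["drilling", "report", "block A"])
tree.record("alice", ["drilling"], ["drilling", "report"])
suggestions = tree.suggest("alice", "drilling")
```

Because the structure is keyed by user, the associations learned from one user's confirmations do not affect suggestions for another user.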