CN115587115A - A database query optimization method and system - Google Patents
- Publication number
- CN115587115A (application CN202211587212.5A)
- Authority
- CN
- China
- Prior art keywords
- retrieval
- data
- database
- feature
- fuzzy
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2468—Fuzzy queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/26—Visual data mining; Browsing structured data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Automation & Control Theory (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The invention relates to the field of database querying, and in particular to a database query optimization method and system.
Background
In the database field, the ability to query data is one of the key measures of a database. If queries are inefficient, the database as a whole responds slowly and cannot meet users' need to find and obtain data quickly. In addition, the data throughput and computing capacity that a query occupies also affect the user experience during querying and the rate at which the database's health is consumed.
Most query methods in the prior art compare the user's multiple terms, joined by "AND" and "OR" relations, indiscriminately against the whole database. This occupies a large amount of data throughput and computing power. Moreover, because the data content itself is searched, when some terms are off target the combination of multiple terms skews the query result, so that the required data cannot be found correctly.
Summary of the Invention
The purpose of the present invention is to provide a database query optimization method and system that solve the problems raised in the Background section above.
To achieve this purpose, the present invention provides the following technical solutions:
A database query optimization system, comprising:
a retrieval space building module, configured to construct a physical optimization space for retrieval, the physical optimization space comprising an optimized storage unit and an optimized retrieval unit, the optimized storage unit being used to store retrieval features, the optimized retrieval unit being used to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space being communicatively connected to the database;
a retrieval space mapping module, configured to establish data pointer links that correspond one-to-one with the data in the database and store them in the optimized storage unit, obtain the retrieval features of the corresponding data in the database through a feature acquisition program, bind the retrieval features to the corresponding data pointer links, and merge retrieval features that have the same content, the retrieval features comprising title features, content features, and user tag features of the data;
a feature association retrieval module, configured to obtain a data query instruction from a user, the data query instruction containing multiple groups of retrieval features, traverse the optimized storage unit for each retrieval feature in turn to obtain multiple data pointer links, and count the traversal responses of the data pointer links with a retrieval counter to generate an associated retrieval result;
a data display and verification module, configured to sort the data pointer links in the associated retrieval result in descending order of traversal response count, obtain part of the content of the corresponding data in the database through the data pointer links to generate a verification preview, output the verification preview, and receive query confirmation feedback from the user.
As a further solution of the present invention, the retrieval space mapping module comprises:
a media feature acquisition module, configured to recognize media content using a preset media object recognition program, obtain the constituent elements of the media image, and set a feature proportion for each corresponding retrieval feature based on how closely the element matches a reference element in a preset recognition library; each reference element includes multiple retrieval features, the different retrieval features of the same reference element distinguish different ways of expressing it, and the feature proportion is used as the counting coefficient when traversal responses are counted.
As a still further solution of the present invention, the feature association retrieval module comprises:
an additional filtering unit, configured to receive additional query conditions from the user and filter the data pointer links based on them; the additional query conditions are independent of the retrieval features and apply to every data pointer link and to the database data, and they include time information, file uploader, file type, and file size.
As a still further solution of the present invention, the system further comprises a feature fuzzing module, which specifically comprises:
a lexical fuzzing unit, configured to apply lexical fuzzing to the retrieval feature, obtain retrieval features whose wording is close to that of the retrieval feature as fuzzy retrieval features, and assign each fuzzy retrieval feature a feature proportion based on how much its wording overlaps with the retrieval feature;
a semantic fuzzing unit, configured to apply semantic fuzzing to the retrieval feature, obtain retrieval features whose expression is close to that of the retrieval feature as fuzzy retrieval features, and assign each fuzzy retrieval feature a feature proportion based on a preset semantic fuzziness grade library, the grade library comprising multiple storage spaces of similar words, each corresponding to a different feature proportion.
As a still further solution of the present invention, the system further comprises an object association module;
the object association module is configured to build a retrieval feature association tree for each user based on that user's retrieval preferences and the final results of query confirmation feedback; the retrieval feature association tree represents, for a search by a given retrieval feature, the association with other retrieval features that the user did not enter but that the retrieved object may contain.
Embodiments of the present invention also aim to provide a database query optimization method, comprising the steps of:
constructing a physical optimization space for retrieval, the physical optimization space comprising an optimized storage unit and an optimized retrieval unit, the optimized storage unit being used to store retrieval features, the optimized retrieval unit being used to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space being communicatively connected to the database;
establishing data pointer links that correspond one-to-one with the data in the database and storing them in the optimized storage unit, obtaining the retrieval features of the corresponding data in the database through a feature acquisition program, binding the retrieval features to the corresponding data pointer links, and merging retrieval features that have the same content, the retrieval features comprising title features, content features, and user tag features of the data;
obtaining a data query instruction from a user, the data query instruction containing multiple groups of retrieval features, traversing the optimized storage unit for each retrieval feature in turn to obtain multiple data pointer links, and counting the traversal responses of the data pointer links with a retrieval counter to generate an associated retrieval result;
sorting the data pointer links in the associated retrieval result in descending order of traversal response count, obtaining part of the content of the corresponding data in the database through the data pointer links to generate a verification preview, outputting the verification preview, and receiving query confirmation feedback from the user.
As a further solution of the present invention, the step of obtaining the retrieval features of the corresponding data in the database through the feature acquisition program comprises:
recognizing media content using a preset media object recognition program, obtaining the constituent elements of the media image, and setting a feature proportion for each corresponding retrieval feature based on how closely the element matches a reference element in a preset recognition library; each reference element includes multiple retrieval features, the different retrieval features of the same reference element distinguish different ways of expressing it, and the feature proportion is used as the counting coefficient when traversal responses are counted.
As a still further solution of the present invention, the method further comprises an additional retrieval step:
receiving additional query conditions from the user and filtering the data pointer links based on them; the additional query conditions are independent of the retrieval features and apply to every data pointer link and to the database data, and they include time information, file uploader, file type, and file size.
As a still further solution of the present invention, the method further comprises the steps of:
applying lexical fuzzing to the retrieval feature, obtaining retrieval features whose wording is close to that of the retrieval feature as fuzzy retrieval features, and assigning each fuzzy retrieval feature a feature proportion based on how much its wording overlaps with the retrieval feature;
applying semantic fuzzing to the retrieval feature, obtaining retrieval features whose expression is close to that of the retrieval feature as fuzzy retrieval features, and assigning each fuzzy retrieval feature a feature proportion based on a preset semantic fuzziness grade library, the grade library comprising multiple storage spaces of similar words, each corresponding to a different feature proportion.
As a still further solution of the present invention, the method further comprises the step of:
building a retrieval feature association tree for each user based on that user's retrieval preferences and the final results of query confirmation feedback; the retrieval feature association tree represents, for a search by a given retrieval feature, the association with other retrieval features that the user did not enter but that the retrieved object may contain.
Compared with the prior art, the beneficial effects of the present invention are as follows. Because the physical optimization space exists independently of the database, retrieval occupies far less of the database's data throughput channel and computing capacity. In addition, data objects are broken down into features and retrieved by multi-feature overlap; combined with feature merging, this greatly reduces the amount of data searched during retrieval, improves retrieval efficiency, and makes it possible to obtain a large number of feature-matching data objects in a short time and to filter them in combination.
Brief Description of the Drawings
FIG. 1 is a block diagram of a database query optimization system.
FIG. 2 is a block diagram of the feature fuzzing module in a database query optimization system.
FIG. 3 is a flowchart of a database query optimization method.
Detailed Description
To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Specific implementations of the present invention are described in detail below with reference to specific embodiments.
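As shown in FIG. 1, a database query optimization system provided by one embodiment of the present invention comprises:
a retrieval space building module 100, configured to construct a physical optimization space for retrieval, the physical optimization space comprising an optimized storage unit and an optimized retrieval unit, the optimized storage unit being used to store retrieval features, the optimized retrieval unit being used to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space being communicatively connected to the database.
a retrieval space mapping module 300, configured to establish data pointer links that correspond one-to-one with the data in the database and store them in the optimized storage unit, obtain the retrieval features of the corresponding data in the database through a feature acquisition program, bind the retrieval features to the corresponding data pointer links, and merge retrieval features that have the same content, the retrieval features comprising title features, content features, and user tag features of the data.
a feature association retrieval module 500, configured to obtain a data query instruction from a user, the data query instruction containing multiple groups of retrieval features, traverse the optimized storage unit for each retrieval feature in turn to obtain multiple data pointer links, and count the traversal responses of the data pointer links with a retrieval counter to generate an associated retrieval result.
a data display and verification module 700, configured to sort the data pointer links in the associated retrieval result in descending order of traversal response count, obtain part of the content of the corresponding data in the database through the data pointer links to generate a verification preview, output the verification preview, and receive query confirmation feedback from the user.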
This embodiment provides a data query optimization system. Because the physical optimization space exists independently of the database, retrieval occupies far less of the database's data throughput channel and computing capacity. Data objects are also broken down into features and retrieved by multi-feature overlap; combined with feature merging, this greatly reduces the amount of data searched, improves retrieval efficiency, and makes it possible to obtain a large number of feature-matching data objects in a short time and to filter them in combination. Specifically, a physical optimization space is established alongside the database and connected to it. When new data is stored in the database, its retrieval features are extracted (data features such as keywords in the title, or content keywords or high-frequency words in the data body). A data pointer link to that data in the database is created and bound to each feature, and each feature is merged with any identical feature already in the physical optimization space. At retrieval time, a single retrieval feature entered by the user yields multiple data pointer links. When the user enters several retrieval keywords, the same data pointer link may be read more than once, so the data pointer links are counted. The link read most often for the user's keywords is then known, and the query result is output on that basis. On output, part of the data content can be fetched from the database as a preview so that the user can confirm the result. Compared with the "AND/OR" logic that the prior art applies to multiple parallel keywords, the present application does not search the data content directly; it marks each data item with a retrieval count based on keyword lookup. This avoids the problem that, under "AND/OR" logic, a large number of keywords skews the retrieval result or even makes the correct data impossible to retrieve. In the present application, adding keywords only makes the retrieval count more accurate; it does not cause the results to shrink multiplicatively.
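The counting scheme described above can be sketched as a small inverted index. This is only an illustration under stated assumptions: the class `OptimizationSpace`, its methods `add_record` and `search`, and the use of plain string ids as data pointer links are all hypothetical and do not come from the patent text.

```python
from collections import Counter, defaultdict


class OptimizationSpace:
    """Hypothetical stand-in for the physical optimization space."""

    def __init__(self):
        # retrieval feature -> set of data pointer links (plain ids here)
        self.index = defaultdict(set)

    def add_record(self, link, features):
        # Identical features are merged: one index entry per distinct feature.
        for feature in features:
            self.index[feature].add(link)

    def search(self, query_features):
        # Count how many query features hit each link, then sort descending
        # (ties broken by link id so the order is deterministic).
        counts = Counter()
        for feature in query_features:
            for link in self.index.get(feature, ()):
                counts[link] += 1
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


space = OptimizationSpace()
space.add_record("doc-1", ["query", "optimization", "index"])
space.add_record("doc-2", ["query", "fuzzy"])
space.add_record("doc-3", ["image", "index"])

# "cache" matches nothing; a strict AND over all three terms would return no
# record, while counting still ranks doc-1 first.
results = space.search(["query", "index", "cache"])
print(results)
```

In this sketch an off-target keyword simply contributes no counts, which mirrors the claim that extra keywords do not shrink the result set.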
As another preferred embodiment of the present invention, the retrieval space mapping module 300 comprises:
a media feature acquisition module, configured to recognize media content using a preset media object recognition program, obtain the constituent elements of the media image, and set a feature proportion for each corresponding retrieval feature based on how closely the element matches a reference element in a preset recognition library; each reference element includes multiple retrieval features, the different retrieval features of the same reference element distinguish different ways of expressing it, and the feature proportion is used as the counting coefficient when traversal responses are counted.
In this embodiment, the media feature acquisition module is added to the retrieval space mapping module 300 because of the type of data involved. Media data (here mainly image media and video media, where video can be treated as a continuous series of images) differs from basic data such as text in that its retrieval features (that is, the keywords of its content) can be uncertain: the same image element can be expressed by many different keywords, and similar elements may be expressed in similar ways. Dedicated recognition of content elements and extraction of keywords is therefore required. Each element corresponds to multiple keywords, and the different, similar expressions should each be given a feature proportion that expresses the degree of overlap between the expression and the element. The proportion is multiplied in as a coefficient during query retrieval, which makes the data produced by the counting process more reliable.
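A minimal sketch of how a feature proportion could act as a counting coefficient, assuming each index entry stores a proportion in (0, 1] per data pointer link. The index contents, the function name `weighted_search`, and the choice to sum proportions are illustrative assumptions rather than details given in the patent.

```python
from collections import defaultdict

# Hypothetical weighted index: feature -> {data pointer link: feature proportion}.
# "dog" and "puppy" are two expressions of the same recognized image element.
weighted_index = {
    "dog": {"img-1": 0.9, "img-2": 0.4},
    "puppy": {"img-1": 0.6},
    "grass": {"img-2": 0.8},
}


def weighted_search(index, query_features):
    """Add the feature proportion, instead of 1, for every traversal response."""
    scores = defaultdict(float)
    for feature in query_features:
        for link, proportion in index.get(feature, {}).items():
            scores[link] += proportion
    # Descending by weighted count, ties broken by link id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = weighted_search(weighted_index, ["dog", "grass"])
print(ranked)
```

Here `img-2` outranks `img-1` because two moderately matching elements together outweigh one strong match; whether that is the desired behaviour depends on how the proportions are calibrated.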
As another preferred embodiment of the present invention, the feature association retrieval module 500 comprises:
an additional filtering unit, configured to receive additional query conditions from the user and filter the data pointer links based on them; the additional query conditions are independent of the retrieval features and apply to every data pointer link and to the database data, and they include time information, file uploader, file type, and file size.
In this embodiment, the additional filtering unit narrows the scope of retrieval through additional query conditions. For example, setting a time range for the data objects to be retrieved can greatly reduce the amount of data searched and can also effectively improve the retrieval success rate.
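The additional filtering unit can be pictured as a metadata filter applied to the data pointer links, independent of the retrieval features. The sketch below assumes each link carries the four kinds of metadata listed above; the `LinkMeta` record, its field names, and the `apply_filters` signature are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LinkMeta:
    """Hypothetical metadata stored alongside a data pointer link."""
    link: str
    uploaded: date
    uploader: str
    file_type: str
    size_bytes: int


def apply_filters(metas, *, after=None, before=None,
                  uploader=None, file_type=None, max_size=None):
    """Keep only the links that satisfy every condition that was supplied."""
    kept = []
    for m in metas:
        if after is not None and m.uploaded < after:
            continue
        if before is not None and m.uploaded > before:
            continue
        if uploader is not None and m.uploader != uploader:
            continue
        if file_type is not None and m.file_type != file_type:
            continue
        if max_size is not None and m.size_bytes > max_size:
            continue
        kept.append(m.link)
    return kept


metas = [
    LinkMeta("doc-1", date(2022, 3, 1), "alice", "pdf", 120_000),
    LinkMeta("doc-2", date(2021, 7, 9), "bob", "pdf", 80_000),
    LinkMeta("doc-3", date(2022, 9, 30), "alice", "png", 2_000_000),
]

recent_pdfs = apply_filters(metas, after=date(2022, 1, 1), file_type="pdf")
print(recent_pdfs)
```

Running the filter before feature counting would shrink the set of links to be counted, while running it afterwards only trims the ranked list; the patent text does not fix the order.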
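As shown in FIG. 2, another preferred embodiment of the present invention further comprises a feature fuzzing module 900, which specifically comprises:
a lexical fuzzing unit 901, configured to apply lexical fuzzing to the retrieval feature, obtain retrieval features whose wording is close to that of the retrieval feature as fuzzy retrieval features, and assign each fuzzy retrieval feature a feature proportion based on how much its wording overlaps with the retrieval feature.
a semantic fuzzing unit 902, configured to apply semantic fuzzing to the retrieval feature, obtain retrieval features whose expression is close to that of the retrieval feature as fuzzy retrieval features, and assign each fuzzy retrieval feature a feature proportion based on a preset semantic fuzziness grade library, the grade library comprising multiple storage spaces of similar words, each corresponding to a different feature proportion.
In this embodiment, the feature fuzzing module 900 performs fuzzy association on the words the user enters during retrieval. When the user's memory of the content to be retrieved is wrong or imprecise, the module widens the scope of retrieval and raises the chance of finding the corresponding content. The module is enabled as needed, when the user cannot find the corresponding content. Because a user's mistaken expression may be close to the intended keyword either in wording or in meaning, the module includes both the lexical fuzzing unit 901 and the semantic fuzzing unit 902.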
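The two fuzzing units can be sketched as two ways of expanding a query feature into weighted variants. The patent does not specify how wording overlap is measured, so this sketch swaps in `difflib.SequenceMatcher.ratio()` from the Python standard library as one possible measure; the known-feature list, the grade library contents, the 0.8 threshold, and all function names are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical features already stored in the optimized storage unit.
KNOWN_FEATURES = ["optimization", "optimisation", "option", "database"]

# Hypothetical semantic fuzziness grade library: for each word, buckets of
# similar words, each bucket tied to one feature proportion.
SEMANTIC_GRADES = {
    "query": [(0.8, ["search", "retrieval"]), (0.5, ["lookup"])],
}


def lexical_fuzz(feature, known_features, threshold=0.8):
    """Return {similar feature: proportion}, using string overlap as proportion."""
    fuzzy = {}
    for candidate in known_features:
        if candidate == feature:
            continue
        overlap = SequenceMatcher(None, feature, candidate).ratio()
        if overlap >= threshold:
            fuzzy[candidate] = overlap
    return fuzzy


def semantic_fuzz(feature, grade_library):
    """Return {similar word: proportion} looked up from the preset grade library."""
    fuzzy = {}
    for proportion, words in grade_library.get(feature, []):
        for word in words:
            fuzzy[word] = proportion
    return fuzzy


def expand(feature):
    """The original feature at proportion 1.0 plus lexical and semantic variants."""
    expanded = {feature: 1.0}
    for variants in (lexical_fuzz(feature, KNOWN_FEATURES),
                     semantic_fuzz(feature, SEMANTIC_GRADES)):
        for word, proportion in variants.items():
            expanded[word] = max(expanded.get(word, 0.0), proportion)
    return expanded


lexical = expand("optimization")
semantic = expand("query")
print(lexical)
print(semantic)
```

The expanded dictionary could then feed a weighted count, so that a hit on a fuzzy variant contributes less than a hit on the feature the user actually typed.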
As another preferred embodiment of the present invention, the system further comprises an object association module;
the object association module is configured to build a retrieval feature association tree for each user based on that user's retrieval preferences and the final results of query confirmation feedback; the retrieval feature association tree represents, for a search by a given retrieval feature, the association with other retrieval features that the user did not enter but that the retrieved object may contain.
In this embodiment, the object association module is set up per user. Based on a user's retrieval habits, an association tree of that user's retrieval keywords can be built and used to make the system's fuzzy association more accurate during retrieval.
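One way to picture the per-user association tree is a two-level mapping from user to query feature to the features that confirmed results contained but the user did not type. The patent does not describe the tree's shape or update rule, so the `AssociationTree` class, its methods, and the simple co-occurrence counting below are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class AssociationTree:
    """Hypothetical per-user tree: user -> query feature -> associated features."""

    def __init__(self):
        self.tree = defaultdict(lambda: defaultdict(Counter))

    def record_confirmation(self, user, query_features, confirmed_features):
        # Features of the confirmed record that the user did not enter.
        unseen = set(confirmed_features) - set(query_features)
        for feature in query_features:
            self.tree[user][feature].update(unseen)

    def suggest(self, user, feature, top_n=3):
        # Read-only lookup so unknown users or features do not grow the tree.
        branches = self.tree.get(user, {})
        counter = branches.get(feature, Counter())
        return [assoc for assoc, _ in counter.most_common(top_n)]


tree = AssociationTree()
tree.record_confirmation("alice", ["invoice"], ["invoice", "2022", "pdf"])
tree.record_confirmation("alice", ["invoice"], ["invoice", "pdf", "tax"])

# "pdf" was present in both confirmed results, so it is the strongest association.
print(tree.suggest("alice", "invoice", top_n=1))
```

Suggestions from such a structure could be offered as extra fuzzy features the next time the same user searches for "invoice".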
As shown in FIG. 3, the present invention also provides a database query optimization method, comprising the steps of:
S200: constructing a physical optimization space for retrieval, the physical optimization space comprising an optimized storage unit and an optimized retrieval unit, the optimized storage unit being used to store retrieval features, the optimized retrieval unit being used to traverse the optimized storage unit in response to a data query instruction from a user, and the physical optimization space being communicatively connected to the database.
S400: establishing data pointer links that correspond one-to-one with the data in the database and storing them in the optimized storage unit, obtaining the retrieval features of the corresponding data in the database through a feature acquisition program, binding the retrieval features to the corresponding data pointer links, and merging retrieval features that have the same content, the retrieval features comprising title features, content features, and user tag features of the data.
S600: obtaining a data query instruction from a user, the data query instruction containing multiple groups of retrieval features, traversing the optimized storage unit for each retrieval feature in turn to obtain multiple data pointer links, and counting the traversal responses of the data pointer links with a retrieval counter to generate an associated retrieval result.
S800: sorting the data pointer links in the associated retrieval result in descending order of traversal response count, obtaining part of the content of the corresponding data in the database through the data pointer links to generate a verification preview, outputting the verification preview, and receiving query confirmation feedback from the user.
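Step S800 is the only point at which the database itself is read. A minimal sketch follows, assuming the traversal response counts from S600 arrive as a dictionary and the database is a plain mapping from link to content; `build_previews`, `confirm`, and the 20-character preview length are hypothetical.

```python
def build_previews(counts, database, preview_chars=20):
    """Sort links by count (descending) and fetch a short excerpt for each."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Only a slice of each record is read from the database for the preview.
    return [(link, hits, database[link][:preview_chars]) for link, hits in ranked]


def confirm(previews, chosen_index):
    """Return the link the user confirmed, or None if every preview was rejected."""
    if chosen_index is None:
        return None
    link, _hits, _excerpt = previews[chosen_index]
    return link


# Hypothetical counts produced by S600 and a toy database.
counts = {"doc-2": 1, "doc-1": 3, "doc-3": 2}
database = {
    "doc-1": "Query optimization notes for the index team",
    "doc-2": "Fuzzy matching experiments, first draft",
    "doc-3": "Image index layout and naming rules",
}

previews = build_previews(counts, database)
for link, hits, excerpt in previews:
    print(link, hits, excerpt)

chosen = confirm(previews, 0)
```

The confirmed link is the kind of feedback that a per-user association structure could later learn from.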
As another preferred embodiment of the present invention, the step of obtaining the retrieval features of the corresponding data in the database through the feature acquisition program comprises:
recognizing media content using a preset media object recognition program, obtaining the constituent elements of the media image, and setting a feature proportion for each corresponding retrieval feature based on how closely the element matches a reference element in a preset recognition library; each reference element includes multiple retrieval features, the different retrieval features of the same reference element distinguish different ways of expressing it, and the feature proportion is used as the counting coefficient when traversal responses are counted.
As another preferred embodiment of the present invention, the step of generating the associated retrieval result further comprises an additional retrieval step:
receiving additional query conditions from the user and filtering the data pointer links based on them; the additional query conditions are independent of the retrieval features and apply to every data pointer link and to the database data, and they include time information, file uploader, file type, and file size.
As another preferred embodiment of the present invention, the method further comprises the steps of:
applying lexical fuzzing to the retrieval feature, obtaining retrieval features whose wording is close to that of the retrieval feature as fuzzy retrieval features, and assigning each fuzzy retrieval feature a feature proportion based on how much its wording overlaps with the retrieval feature.
applying semantic fuzzing to the retrieval feature, obtaining retrieval features whose expression is close to that of the retrieval feature as fuzzy retrieval features, and assigning each fuzzy retrieval feature a feature proportion based on a preset semantic fuzziness grade library, the grade library comprising multiple storage spaces of similar words, each corresponding to a different feature proportion.
As another preferred embodiment of the present invention, the method further includes the steps of:
Based on the retrieval preferences of different users and the final results of query confirmation feedback, a retrieval feature association tree is built for each user. For a search that the user performs with a given retrieval feature, the tree characterizes the association between that feature and other retrieval features that the retrieved object may contain but that the user did not enter.
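One possible shape for such a per-user structure is sketched below. The source does not describe how the tree is built or stored, so everything here is an assumption: the tree is reduced to two levels (entered feature, then associated features), associations are plain co-occurrence counts taken from confirmed results, and the class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class AssociationTree:
    """Per-user, two-level tree: entered feature -> features the user did not enter."""

    def __init__(self) -> None:
        self._children: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def record_confirmation(self, entered: Iterable[str],
                            object_features: Iterable[str]) -> None:
        """Update the tree from one confirmed query result.

        `entered` holds the features the user typed; `object_features` holds
        all features of the object the user finally confirmed.
        """
        entered_set = set(entered)
        not_entered = set(object_features) - entered_set
        for feature in entered_set:
            for other in not_entered:
                self._children[feature][other] += 1

    def suggestions(self, feature: str, top: int = 3) -> List[str]:
        """Return the features most often associated with `feature`."""
        children = self._children.get(feature, {})
        # Sort by descending count, then alphabetically for a stable order.
        return sorted(children, key=lambda name: (-children[name], name))[:top]


# One tree per user, keyed by a hypothetical user identifier.
trees: Dict[str, AssociationTree] = defaultdict(AssociationTree)
trees["user-1"].record_confirmation({"contract"}, {"contract", "2021", "supplier"})
trees["user-1"].record_confirmation({"contract"}, {"contract", "supplier"})
```

With the two confirmations above, a later search by "contract" from the same user would rank "supplier" ahead of "2021" as a likely associated feature.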
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other media in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only; the true scope and spirit of the disclosure are indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211587212.5A CN115587115B (en) | 2022-12-12 | 2022-12-12 | A database query optimization method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211587212.5A CN115587115B (en) | 2022-12-12 | 2022-12-12 | A database query optimization method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115587115A (en) | 2023-01-10 |
CN115587115B (en) | 2023-02-28 |
Family
ID=84782998
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211587212.5A Active CN115587115B (en) | 2022-12-12 | 2022-12-12 | A database query optimization method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115587115B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116151370A (en) * | 2023-04-24 | 2023-05-23 | 西南石油大学 | A Model Parameter Optimal Selection System |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110295839A1 (en) * | 2010-05-27 | 2011-12-01 | Salesforce.Com, Inc. | Optimizing queries in a multi-tenant database system environment |
CN105912649A (en) * | 2016-04-08 | 2016-08-31 | 南京邮电大学 | Database fuzzy retrieval method and system |
CN107077512A (en) * | 2015-03-28 | 2017-08-18 | 华为技术有限公司 | Systems and methods for optimizing queries on views |
CN109977334A (en) * | 2019-03-26 | 2019-07-05 | 浙江度衍信息技术有限公司 | Retrieval rate optimization method |
CN111339244A (en) * | 2020-02-29 | 2020-06-26 | 山东浪潮通软信息科技有限公司 | Tax policy and regulation inquiry method, computer equipment and storage medium |
CN113407807A (en) * | 2020-12-15 | 2021-09-17 | 腾讯科技(深圳)有限公司 | Query optimization method and device for search engine and electronic equipment |
CN113934869A (en) * | 2021-09-23 | 2022-01-14 | 阿里云计算有限公司 | Database construction method, multimedia file retrieval method and device |
Non-Patent Citations (3)
Title |
---|
J. GRANT ET AL.: "Logic-based query optimization for object databases" * |
樊敏: "基于分布式关系型数据库的查询算法优化", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 * |
罗光红: "分布式面向对象数据库的查询优化及应用研究", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 * |
Also Published As
Publication number | Publication date |
---|---|
CN115587115B (en) | 2023-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020057022A1 (en) | Associative recommendation method and apparatus, computer device, and storage medium | |
EP3745276A1 (en) | Discovering a semantic meaning of data fields from profile data of the data fields | |
US10956518B2 (en) | Systems and methods for improved web searching | |
US8812493B2 (en) | Search results ranking using editing distance and document information | |
US9141665B1 (en) | Optimizing search system resource usage and performance using multiple query processing systems | |
CN112989990B (en) | Medical bill identification method, device, equipment and storage medium | |
US8984021B2 (en) | System and method for harvesting electronically stored content by custodian | |
EP2342684A1 (en) | Fuzzy data operations | |
US10169352B2 (en) | System for performing parallel forensic analysis of electronic data and method therefor | |
CN111782595A (en) | Mass file management method, apparatus, computer equipment and readable storage medium | |
CN115587115B (en) | A database query optimization method and system | |
CN116738988A (en) | Text detection method, computer device, and storage medium | |
CN115687463A (en) | Method and system for optimizing data sorting rules of patent retrieval result list | |
CN108763458B (en) | Content characteristic query method, device, computer equipment and storage medium | |
CN118536717A (en) | Carbon footprint accounting system, method, computer device, computer readable storage medium, and computer program product | |
CN113221535A (en) | Information processing method, device, computer equipment and storage medium | |
CN117271713A (en) | Associated object recognition method, associated object recognition device, electronic equipment and storage medium | |
CN117493645B (en) | Big data-based electronic archive recommendation system | |
CN117421565B (en) | Markov blanket-based equipment assessment method and device and computer equipment | |
US10872103B2 (en) | Relevance optimized representative content associated with a data storage system | |
CN115795024B (en) | Intellectual property information display method and system | |
CN110633446B (en) | Webpage column recognition model training method, using method, device and storage medium | |
CN116501831A (en) | Problem recall method, device, equipment and storage medium | |
CN119760107A (en) | Legal database query system and method | |
RU2409849C2 (en) | Method of searching for information in multi-topic unstructured text arrays |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||