CN103744913A - Database retrieval method based on search engine technology - Google Patents
Database retrieval method based on search engine technology
- Publication number
- CN103744913A
- Authority
- CN
- China
- Prior art keywords
- retrieval
- database
- method based
- index
- search engine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Abstract
The invention discloses a database retrieval method based on search engine technology. The method comprises: S1, setting up a retrieval server with a service interface to provide an auxiliary fast retrieval service, the retrieval server obtaining data from a relational database and building indexes; S2, a client sending a retrieval request to the retrieval server through the service interface; S3, the retrieval server obtaining a retrieval result according to the retrieval conditions and sending the result to the client through the service interface; S4, the client processing and displaying the returned result. By adding the retrieval server, the method achieves efficient retrieval without affecting the system performance of the original database.
Description
Technical field
The present invention relates to information retrieval technology, and in particular to a database retrieval method based on search engine technology.
Background art
At present, the content of a relational database is generally retrieved by writing SQL query statements for the database system to execute. When the number of records is large, for example more than a hundred million, retrieval by the database becomes very inefficient.
To improve retrieval efficiency, current techniques all optimize the database itself, including: 1. building indexes on fields that are frequently used as search conditions; 2. partitioning large tables or splitting them into multiple tables, then combining this with optimized application logic to improve the efficiency of retrieving part of the data.
Although building indexes, partitioning, and table splitting can improve the efficiency of content retrieval in a relational database, the database system pays a considerable cost, as follows:
Owing to the characteristics of relational databases, building an index on a field degrades the performance of other database operations such as insert, update, and delete; when search conditions are complex, more fields must be indexed and the performance cost grows further. In addition, after a database has been partitioned or split into multiple tables, a search across the full data set still faces the performance problem of merging and sorting the data.
Summary of the invention
To overcome the deficiencies of the prior art, the invention provides a database retrieval method based on search engine technology. Without affecting the performance of the original database system, it adds a separate retrieval server and uses search engine technology to provide a high-performance retrieval service over the database content.
The present invention adopts the following technical scheme:
A database retrieval method based on search engine technology, the method comprising:
S1. setting up a retrieval server with a service interface to provide an auxiliary fast retrieval service, the retrieval server obtaining data from a relational database and building an index;
S2. a client sending a retrieval request to the retrieval server through the service interface;
S3. the retrieval server obtaining a retrieval result according to the search condition and sending the retrieval result to the client through the service interface;
S4. the client processing and displaying the returned result.
The invention adds a separate retrieval server that indexes the data newly added to the relational database, so that the client can quickly obtain the information it wants from the retrieval server. This is much faster than retrieving directly from the relational database: as new data arrives and the data volume grows to a certain point, the relational database hits a processing bottleneck and retrieval becomes very slow, whereas the retrieval server can still respond to client requests quickly.
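Steps S1 to S4 can be illustrated with a minimal in-process sketch. The names `RetrievalServer`, `build_index`, `search`, and `client_display` are hypothetical, the `docs` table is invented for the example, and the service interface is modeled as a plain method call because the source does not specify a protocol. Whitespace splitting stands in for real word segmentation.

```python
import sqlite3


class RetrievalServer:
    """Hypothetical retrieval server; the service interface is a method call."""

    def __init__(self, db):
        self.db = db
        self.index = {}  # term -> set of row ids

    def build_index(self):
        # S1: read rows from the relational database and index them.
        for row_id, body in self.db.execute("SELECT id, body FROM docs"):
            for term in body.lower().split():
                self.index.setdefault(term, set()).add(row_id)

    def search(self, term):
        # S3: answer the search condition from the index, not the database.
        return sorted(self.index.get(term.lower(), set()))


def client_display(server, term):
    # S2: send the request; S4: process and format the returned result.
    ids = server.search(term)
    return f"{len(ids)} hit(s) for '{term}': {ids}"


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [(1, "search engine index"), (2, "relational database"), (3, "database index")],
)
server = RetrievalServer(db)
server.build_index()
print(client_display(server, "index"))  # prints: 2 hit(s) for 'index': [1, 3]
```

In this sketch the database is touched only while the index is built; queries are served entirely from the server's in-memory index.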
Specifically, the retrieval server obtaining data from the relational database and building an index comprises:
S11. at a set time interval, and using the timestamp of the last data acquisition, the retrieval server periodically obtains updated data from the relational database;
S12. the new data is segmented into words;
S13. an inverted index is built from the containment relation between terms and documents and written to a compactly structured index file.
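A minimal sketch of steps S11 to S13, under stated assumptions: the table `docs` and its `updated_at` column are invented for the example, `tokenize` is a whitespace stand-in for a real word segmenter, and the index stays in a dictionary rather than being written to a compact file.

```python
import sqlite3
from collections import defaultdict


def fetch_updates(db, last_ts):
    """S11: return rows modified after last_ts, plus the new high-water mark."""
    rows = db.execute(
        "SELECT id, body, updated_at FROM docs "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_ts,),
    ).fetchall()
    new_ts = rows[-1][2] if rows else last_ts
    return rows, new_ts


def tokenize(text):
    """S12: stand-in word segmentation (lowercase, split on whitespace)."""
    return text.lower().split()


def index_rows(index, rows):
    """S13: record, for each term, the ids of the documents containing it."""
    for row_id, body, _ in rows:
        for term in set(tokenize(body)):
            index[term].add(row_id)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, updated_at INTEGER)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, "inverted index", 100), (2, "search engine", 110)],
)

index = defaultdict(set)
rows, last_ts = fetch_updates(db, 0)
index_rows(index, rows)

# A later polling cycle picks up only the row added since last_ts.
db.execute("INSERT INTO docs VALUES (3, 'search index', 120)")
rows, last_ts = fetch_updates(db, last_ts)
index_rows(index, rows)
```

In a deployment the two polling cycles would be driven by a timer at the configured interval. Note that a strict `>` comparison can miss rows that later arrive with the same timestamp as the high-water mark; how the source handles that case is not stated.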
For step S13, when the index files have grown to a certain extent, they are merged.
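The merge step might look like the following sketch. It assumes each index file is a segment mapping terms to sorted document ids, and that merging is triggered by a segment count; `MAX_SEGMENTS`, `merge_segments`, and `add_segment` are hypothetical, since the source only says the files are merged once they have grown "to a certain extent".

```python
import heapq

MAX_SEGMENTS = 3  # hypothetical threshold, not taken from the source


def merge_segments(segments):
    """Merge several {term: sorted ids} segments into one."""
    merged = {}
    for term in set().union(*segments):
        postings = heapq.merge(*(seg.get(term, []) for seg in segments))
        deduped = []
        for doc_id in postings:
            if not deduped or deduped[-1] != doc_id:
                deduped.append(doc_id)
        merged[term] = deduped
    return merged


def add_segment(segments, new_segment):
    """Append a segment and compact once the threshold is exceeded."""
    segments.append(new_segment)
    if len(segments) > MAX_SEGMENTS:
        segments[:] = [merge_segments(segments)]
    return segments


segments = []
for seg in ({"db": [1]}, {"db": [2], "index": [2]}, {"index": [3]}, {"db": [3, 4]}):
    add_segment(segments, seg)
```

Keeping the number of segments small means a query consults fewer posting lists per term, which is the usual motivation for this kind of compaction.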
In connection with step S13, the processing and display of the returned result by the client in step S4 works as follows. After segmenting the new data, the invention keeps only the inverted index and does not keep the full original information. The retrieval result sent to the client is therefore the location, in the relational database, of the records that satisfy the search condition, and the client then uses this location information to quickly fetch the detailed data from the relational database. This is only one possible configuration: which original information is kept in and returned by the retrieval server can be configured as needed. The retrieval server can also keep the full information of the new data and send it to the client according to the search condition.
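The default configuration described here, where the server returns only record locations, can be sketched as follows. Modeling the location as a primary key is an assumption, and `search_positions`, `fetch_details`, and the `docs` table are hypothetical names.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, "A", "search engine"), (2, "B", "relational database"), (3, "C", "database index")],
)

# Index held by the retrieval server: terms map to primary keys only.
index = {"database": [2, 3], "index": [3], "search": [1], "engine": [1], "relational": [2]}


def search_positions(term):
    """Retrieval server side: return record locations, not record contents."""
    return index.get(term, [])


def fetch_details(db, ids):
    """Client side: primary-key lookup for the full records."""
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    return db.execute(
        f"SELECT id, title, body FROM docs WHERE id IN ({marks}) ORDER BY id", ids
    ).fetchall()


records = fetch_details(db, search_positions("database"))
```

The follow-up query is a primary-key lookup, which a relational database typically serves quickly even on large tables, so the expensive content search stays on the retrieval server.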
In addition, the invention applies a different indexing method to each field data type: text is segmented into words and indexed by term, and numeric values are stored in a binary tree.
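A sketch of type-specific indexing, assuming an unbalanced binary search tree for the numeric field; the source says only "binary tree" and does not name a variant. `Node`, `insert`, `range_query`, and `index_record` are hypothetical names, and whitespace splitting again stands in for word segmentation.

```python
class Node:
    """One numeric key and the ids of all documents carrying that value."""

    def __init__(self, key, doc_id):
        self.key, self.doc_ids = key, [doc_id]
        self.left = self.right = None


def insert(node, key, doc_id):
    """Insert (key, doc_id) into an unbalanced binary search tree."""
    if node is None:
        return Node(key, doc_id)
    if key < node.key:
        node.left = insert(node.left, key, doc_id)
    elif key > node.key:
        node.right = insert(node.right, key, doc_id)
    else:
        node.doc_ids.append(doc_id)
    return node


def range_query(node, lo, hi, out):
    """Collect doc ids whose key lies in [lo, hi], in ascending key order."""
    if node is None:
        return out
    if lo < node.key:
        range_query(node.left, lo, hi, out)
    if lo <= node.key <= hi:
        out.extend(node.doc_ids)
    if node.key < hi:
        range_query(node.right, lo, hi, out)
    return out


def index_record(text_index, numeric_root, doc_id, text, number):
    """Dispatch by field type: text goes to the inverted index, numbers to the tree."""
    for term in set(text.lower().split()):
        text_index.setdefault(term, set()).add(doc_id)
    return insert(numeric_root, number, doc_id)


text_index, root = {}, None
for doc_id, text, price in [(1, "red apple", 30), (2, "green apple", 10), (3, "red pear", 20)]:
    root = index_record(text_index, root, doc_id, text, price)

hits = range_query(root, 15, 30, [])  # documents priced between 15 and 30
```

A tree keeps numeric keys ordered, so a range condition only descends into subtrees that can contain matching keys. A production system would likely use a balanced variant to keep lookups logarithmic.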
For step S3, content that satisfies the search condition is looked up in the index file. Results are generally sorted by relevance, but can also be sorted by a condition specified by the user.
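The two orderings can be sketched as below. The source does not define relevance, so a TF-IDF score is used here purely as a common stand-in; the `docs` data, `tf_idf`, `search`, and the `sort_key` parameter are all hypothetical. For brevity the sketch scans the documents instead of consulting an inverted index.

```python
import math
from collections import Counter

docs = {
    1: {"body": "database index database", "year": 2014},
    2: {"body": "search engine index", "year": 2013},
    3: {"body": "database retrieval", "year": 2012},
}


def tf_idf(term, doc_id):
    """Term frequency times a smoothed inverse document frequency."""
    tf = Counter(docs[doc_id]["body"].split())[term]
    df = sum(1 for d in docs.values() if term in d["body"].split())
    return tf * math.log(1 + len(docs) / df) if df else 0.0


def search(term, sort_key=None):
    """Return matching ids, ordered by relevance or by a user-specified field."""
    hits = [i for i, d in docs.items() if term in d["body"].split()]
    if sort_key is None:
        return sorted(hits, key=lambda i: tf_idf(term, i), reverse=True)
    return sorted(hits, key=lambda i: docs[i][sort_key])


by_relevance = search("database")               # document 1 mentions the term twice
by_year = search("database", sort_key="year")   # oldest document first
```

The same hit set comes back in a different order depending on whether the caller supplies a sort field, which mirrors the choice described in step S3.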
To achieve high performance, the index file should be kept in the retrieval server's memory as far as possible. Besides a well-designed storage format for the index file, the invention also provides a caching mechanism: retrieval results for search conditions that contain a range are cached in the retrieval server's memory, so that subsequent similar queries get a faster response.
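One way to picture the caching mechanism is a small in-memory cache keyed by the range condition. The least-recently-used eviction policy, the class `RangeCache`, and the helper `scan_range` are assumptions for illustration; the source does not describe how entries are keyed or evicted.

```python
from collections import OrderedDict


class RangeCache:
    """Small LRU cache keyed by (field, low, high)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0

    def get_or_compute(self, field, low, high, compute):
        key = (field, low, high)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        result = compute(field, low, high)
        self.entries[key] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return result


prices = {1: 30, 2: 10, 3: 20}


def scan_range(field, low, high):
    """Stand-in for an index lookup on a numeric field."""
    return sorted(i for i, v in prices.items() if low <= v <= high)


cache = RangeCache(capacity=2)
first = cache.get_or_compute("price", 15, 30, scan_range)
second = cache.get_or_compute("price", 15, 30, scan_range)  # served from memory
```

Cached entries would need to be invalidated or refreshed whenever the periodic index update changes the underlying data; the source does not say how this is handled.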
Compared with the prior art, the invention has the following beneficial effects:
1. The added retrieval server provides efficient retrieval without affecting the existing system.
2. A good caching mechanism gives similar searches a faster response.
Brief description of the drawings
Fig. 1: flowchart of the present invention.
Detailed description
The invention is further described below with reference to the accompanying drawing.
As shown in Fig. 1, a database retrieval method based on search engine technology comprises:
S1. setting up a retrieval server with a service interface to provide an auxiliary fast retrieval service. At a set time interval, and using the timestamp of the last data acquisition, the retrieval server periodically obtains updated data from the relational database; after segmenting the new data into words, it builds an inverted index from the containment relation between terms and documents and writes it to a compactly structured index file. When the index files have grown to a certain extent, they are merged;
S2. application software sends a retrieval request to the retrieval server through the service interface;
S3. the retrieval server looks up, in the index file, content that satisfies the search condition, sorts it (generally by relevance, or by a condition specified by the user) to obtain the retrieval result, and sends the result to the client through the service interface;
S4. the application software processes and displays the returned result.
In addition, the method provides a caching mechanism: retrieval results for search conditions that contain a range are cached in the retrieval server's memory, so that subsequent similar queries get a faster response.
In a practical test on a database table with 380 million records, a query run directly against the database system with the original method took 236 seconds. With the technical scheme of the invention, the response time was 9 seconds, roughly a 26-fold reduction.
Claims (5)
1. A database retrieval method based on search engine technology, characterized in that the method comprises:
S1. setting up a retrieval server with a service interface to provide an auxiliary fast retrieval service, the retrieval server obtaining data from a relational database and building an index;
S2. a client sending a retrieval request to the retrieval server through the service interface;
S3. the retrieval server obtaining a retrieval result according to the search condition and sending the retrieval result to the client through the service interface;
S4. the client processing and displaying the returned result.
2. The database retrieval method based on search engine technology according to claim 1, characterized in that the retrieval server obtaining data from the relational database and building an index comprises:
S11. at a set time interval, and using the timestamp of the last data acquisition, the retrieval server periodically obtains updated data from the relational database;
S12. the new data is segmented into words;
S13. an inverted index is built from the containment relation between terms and documents and written to a compactly structured index file.
3. The database retrieval method based on search engine technology according to claim 2, characterized in that when the index files have grown to a certain extent, they are merged.
4. The database retrieval method based on search engine technology according to claim 1, characterized in that content satisfying the search condition is looked up in the index file and is generally sorted by relevance, or alternatively by a condition specified by the user.
5. The database retrieval method based on search engine technology according to claim 2, characterized in that a caching mechanism is also provided, in which retrieval results for search conditions that contain a range are cached in the retrieval server's memory, so that subsequent similar queries get a faster response.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310734758.3A CN103744913A (en) | 2013-12-27 | 2013-12-27 | Database retrieval method based on search engine technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310734758.3A CN103744913A (en) | 2013-12-27 | 2013-12-27 | Database retrieval method based on search engine technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103744913A true CN103744913A (en) | 2014-04-23 |
Family
ID=50501931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310734758.3A Pending CN103744913A (en) | 2013-12-27 | 2013-12-27 | Database retrieval method based on search engine technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103744913A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104182546A (en) * | 2014-09-09 | 2014-12-03 | 北京国双科技有限公司 | Method and device for querying data in databases |
CN105320754A (en) * | 2015-10-08 | 2016-02-10 | 上海瀚银信息技术有限公司 | Data searching system and method |
CN105912545A (en) * | 2015-12-15 | 2016-08-31 | 乐视网信息技术(北京)股份有限公司 | Device, method, and system for media resource retrieval |
CN106649804A (en) * | 2016-12-29 | 2017-05-10 | 深圳市优必选科技有限公司 | Data processing method, data processing device and data processing system for data query server |
CN106777088A (en) * | 2016-12-13 | 2017-05-31 | 飞狐信息技术(天津)有限公司 | The method for sequencing search engines and system of iteratively faster |
CN106855890A (en) * | 2017-01-09 | 2017-06-16 | 广州巨杉软件开发有限公司 | A kind of method for realizing the final consistency full-text search of high-performance data storehouse |
CN106920192A (en) * | 2016-12-29 | 2017-07-04 | 广州途威慧信息科技有限公司 | A kind of educational counseling management system |
CN107103011A (en) * | 2016-02-23 | 2017-08-29 | 阿里巴巴集团控股有限公司 | The implementation method and device of terminal data search |
CN107729518A (en) * | 2017-10-26 | 2018-02-23 | 山东浪潮云服务信息科技有限公司 | The text searching method and device of a kind of relevant database |
CN110895538A (en) * | 2018-09-13 | 2020-03-20 | 深圳市蓝灯鱼智能科技有限公司 | Data retrieval method, device, storage medium and processor |
CN111259193A (en) * | 2020-01-16 | 2020-06-09 | 高新兴科技集团股份有限公司 | Feature retrieval system based on clustering filtration and application method thereof |
-
2013
- 2013-12-27 CN CN201310734758.3A patent/CN103744913A/en active Pending
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104182546A (en) * | 2014-09-09 | 2014-12-03 | 北京国双科技有限公司 | Method and device for querying data in databases |
CN104182546B (en) * | 2014-09-09 | 2017-10-27 | 北京国双科技有限公司 | The data query method and device of database |
CN105320754A (en) * | 2015-10-08 | 2016-02-10 | 上海瀚银信息技术有限公司 | Data searching system and method |
WO2017101425A1 (en) * | 2015-12-15 | 2017-06-22 | 乐视控股(北京)有限公司 | Apparatus, method and system for use in retrieval of media resources |
CN105912545A (en) * | 2015-12-15 | 2016-08-31 | 乐视网信息技术(北京)股份有限公司 | Device, method, and system for media resource retrieval |
CN107103011A (en) * | 2016-02-23 | 2017-08-29 | 阿里巴巴集团控股有限公司 | The implementation method and device of terminal data search |
CN106777088A (en) * | 2016-12-13 | 2017-05-31 | 飞狐信息技术(天津)有限公司 | The method for sequencing search engines and system of iteratively faster |
CN106649804A (en) * | 2016-12-29 | 2017-05-10 | 深圳市优必选科技有限公司 | Data processing method, data processing device and data processing system for data query server |
CN106920192A (en) * | 2016-12-29 | 2017-07-04 | 广州途威慧信息科技有限公司 | A kind of educational counseling management system |
CN106855890A (en) * | 2017-01-09 | 2017-06-16 | 广州巨杉软件开发有限公司 | A kind of method for realizing the final consistency full-text search of high-performance data storehouse |
CN106855890B (en) * | 2017-01-09 | 2020-07-28 | 深圳巨杉数据库软件有限公司 | Method for realizing final consistency full-text retrieval of high-performance database |
CN107729518A (en) * | 2017-10-26 | 2018-02-23 | 山东浪潮云服务信息科技有限公司 | The text searching method and device of a kind of relevant database |
CN110895538A (en) * | 2018-09-13 | 2020-03-20 | 深圳市蓝灯鱼智能科技有限公司 | Data retrieval method, device, storage medium and processor |
CN111259193A (en) * | 2020-01-16 | 2020-06-09 | 高新兴科技集团股份有限公司 | Feature retrieval system based on clustering filtration and application method thereof |
CN111259193B (en) * | 2020-01-16 | 2023-08-25 | 高新兴科技集团股份有限公司 | Feature retrieval system based on cluster filtering and application method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103744913A (en) | Database retrieval method based on search engine technology | |
TWI682285B (en) | Product, method, and machine readable medium for kvs tree database | |
CN102521406B (en) | Distributed query method and system for complex task of querying massive structured data | |
CN104572670B (en) | A kind of storage of small documents, inquiry and delet method and system | |
CN104252536B (en) | A kind of internet log data query method and device based on hbase | |
CN104424258B (en) | Multidimensional data query method, query server, column storage server and system | |
CN102890722B (en) | Indexing method applied to time sequence historical database | |
CN107368527B (en) | Multi-attribute index method based on data stream | |
CN105677826A (en) | Resource management method for massive unstructured data | |
CN102467572B (en) | Data block inquiring method for supporting data de-duplication program | |
CN106528847A (en) | Multi-dimensional processing method and system for massive data | |
CN103488704A (en) | Method and device for storing data | |
CN103678491A (en) | Method based on Hadoop small file optimization and reverse index establishment | |
CN103595797B (en) | Caching method for distributed storage system | |
CN103366015A (en) | OLAP (on-line analytical processing) data storage and query method based on Hadoop | |
WO2019105420A1 (en) | Data query | |
CN105117417A (en) | Read-optimized memory database Trie tree index method | |
CN107357843B (en) | Massive network data searching method based on data stream structure | |
TW201415262A (en) | Construction of inverted index system, data processing method and device based on Lucene | |
CN103544261A (en) | Method and device for managing global indexes of mass structured log data | |
CN104239377A (en) | Platform-crossing data retrieval method and device | |
CN103678694A (en) | Method and system for establishing reverse index file of video resources | |
CN104077405A (en) | Sequential type data accessing method | |
CN103440245A (en) | Line and column hybrid storage method of database system | |
CN106528649A (en) | Massive data storage and retrieval system and massive data storage and retrieval methods for new energy vehicles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20140423 |