CN103744913A - Database retrieval method based on search engine technology - Google Patents

Database retrieval method based on search engine technology

Info

Publication number
CN103744913A
Authority
CN
China
Prior art keywords
retrieval
database
method based
index
search engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310734758.3A
Other languages
Chinese (zh)
Inventor
劳定雄
吴仲谋
陈刚
蔡青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gosuncn Technology Group Co Ltd
Original Assignee
Gosuncn Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-12-27
Filing date
2013-12-27
Publication date
2014-04-23
Application filed by Gosuncn Technology Group Co Ltd
Priority to CN201310734758.3A
Publication of CN103744913A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Abstract

The invention discloses a database retrieval method based on search engine technology. The method comprises: S1, setting up a retrieval server with a service interface to provide an auxiliary fast retrieval service, the retrieval server obtaining data from a relational database and building indexes; S2, a client sending a retrieval request to the retrieval server through the service interface; S3, the retrieval server obtaining a retrieval result according to the retrieval conditions and sending it to the client through the service interface; S4, the client processing and displaying the returned result. By adding the retrieval server, the method achieves efficient retrieval without affecting the system performance of the original database.

Description

A database retrieval method based on search engine technology
Technical field
The present invention relates to information retrieval technology, and in particular to a database retrieval method based on search engine technology.
Background art
At present, the contents of a relational database are generally retrieved by writing an SQL query statement and having the database system execute it. When the number of records is large, for example hundreds of millions, retrieval by the database becomes very inefficient.
To improve retrieval efficiency, current techniques all optimize the database itself, including: 1. building indexes on fields that are frequently used as search conditions; 2. partitioning large database tables or splitting them into multiple tables, and then combining this with optimized application logic to improve the efficiency of retrieving part of the data.
Although creating indexes and partitioning or splitting tables can improve the efficiency of content retrieval in a relational database, the database system bears a considerable cost, as described below:
Owing to the characteristics of relational databases, building indexes on the relevant fields degrades the performance of other database operations such as insert, update, and delete. When the search conditions are complex, more fields must be indexed and the performance cost grows accordingly. In addition, after a database has been partitioned or split into multiple tables, a retrieval over the entire data set still faces the performance problem of merging and sorting the data.
Summary of the invention
To overcome the deficiencies of the prior art, the invention provides a database retrieval method based on search engine technology. Without affecting the performance of the original database system, it adds a separate retrieval server that uses search engine technology to provide a high-performance retrieval service over the contents of the database.
The invention adopts the following technical solution:
A database retrieval method based on search engine technology, the method comprising:
S1. A retrieval server with a service interface is set up to provide an auxiliary fast retrieval service; the retrieval server obtains data from a relational database and builds indexes;
S2. A client sends a retrieval request to the retrieval server through the service interface;
S3. The retrieval server obtains a retrieval result according to the search conditions and sends it to the client through the service interface;
S4. The client processes and displays the returned result.
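The four steps above can be sketched in a few lines. This is only an illustrative, in-process sketch under stated assumptions: the `docs` table, the `RetrievalServer` class, the dictionary-shaped request and response, and the whitespace tokenization are all hypothetical and are not specified by the patent; an in-memory SQLite database stands in for the relational database, and a method call stands in for the service interface.

```python
import sqlite3
from collections import defaultdict


class RetrievalServer:
    """S1: auxiliary server that indexes rows pulled from a relational database."""

    def __init__(self, db):
        # term -> set of row ids containing that term
        self.index = defaultdict(set)
        # Hypothetical schema: docs(id INTEGER PRIMARY KEY, body TEXT).
        for doc_id, body in db.execute("SELECT id, body FROM docs"):
            for term in body.lower().split():
                self.index[term].add(doc_id)

    def handle_request(self, request):
        """S3: the service interface; returns ids of rows containing every term."""
        terms = request["query"].lower().split()
        if not terms:
            return {"ids": []}
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return {"ids": sorted(hits)}


def client_search(server, query):
    """S2 and S4: send the request, then format the response for display."""
    response = server.handle_request({"query": query})
    return f"{len(response['ids'])} match(es): {response['ids']}"


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany(
    "INSERT INTO docs (id, body) VALUES (?, ?)",
    [(1, "traffic camera log"), (2, "camera firmware update"), (3, "traffic report")],
)
server = RetrievalServer(db)
print(client_search(server, "traffic camera"))
```

In this sketch the query never touches the relational database after the index has been built, which is the point of the auxiliary server.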
The invention mainly adds a separate retrieval server that builds indexes on the data newly added to the relational database, so that the client can quickly obtain the desired data from the retrieval server. This is much faster than retrieving directly from the relational database: as new data keeps arriving and the data volume grows to a certain level, the database's processing performance hits a bottleneck and retrieval becomes very slow, whereas a retrieval server of this kind responds to client requests quickly.
Here, the retrieval server obtaining data from the relational database and building indexes specifically comprises:
S11. At a configured time interval, and using the timestamp of the last data fetch, the retrieval server periodically obtains updated data from the relational database;
S12. Word segmentation is performed on the new data;
S13. An inverted index is built according to the containment relationship between terms and documents, and is written into a compactly structured index file.
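Steps S11 to S13 can be sketched as follows. This is a minimal sketch under assumptions, not the patent's implementation: the `docs` table with an integer `updated_at` column, the `IncrementalIndexer` class, and the regex tokenizer (a stand-in for a real word segmenter) are hypothetical, and the postings are kept in memory rather than written to an index file.

```python
import re
import sqlite3
from collections import defaultdict


def tokenize(text):
    """S12 stand-in: lowercase alphanumeric tokens instead of a real word segmenter."""
    return re.findall(r"[a-z0-9]+", text.lower())


class IncrementalIndexer:
    def __init__(self, db):
        self.db = db
        self.last_ts = 0                   # timestamp of the newest row already indexed
        self.postings = defaultdict(set)   # term -> ids of rows containing it

    def poll(self):
        """S11 + S13: fetch rows updated since the last poll and index them."""
        rows = self.db.execute(
            "SELECT id, body, updated_at FROM docs "
            "WHERE updated_at > ? ORDER BY updated_at",
            (self.last_ts,),
        ).fetchall()
        for doc_id, body, updated_at in rows:
            for term in tokenize(body):
                self.postings[term].add(doc_id)
            self.last_ts = max(self.last_ts, updated_at)
        return len(rows)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, updated_at INTEGER)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, "Red car, north gate", 100), (2, "Blue car", 200)],
)
indexer = IncrementalIndexer(db)
first = indexer.poll()    # indexes both existing rows
db.execute("INSERT INTO docs VALUES (3, 'Red truck', 300)")
second = indexer.poll()   # indexes only the newly added row
third = indexer.poll()    # nothing new since the last timestamp
print(first, second, third, sorted(indexer.postings["red"]))
```

In a deployed system `poll` would be invoked on a timer at the configured interval. The sketch also does not remove stale terms when an existing row is modified, which a real indexer would need to handle.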
In step S13, when the index files have grown to a certain extent, they are merged.
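The merge step can be illustrated with a small sketch. The patent does not state when or how index files are merged, so the trigger used here (a hypothetical `max_segments` threshold) and the segment layout (a dictionary from term to a sorted list of document ids) are assumptions made for illustration.

```python
import heapq


def merge_segments(segments):
    """Merge several small index segments into one.

    Each segment maps a term to a sorted list of document ids.
    """
    merged = {}
    for term in sorted({t for seg in segments for t in seg}):
        lists = [seg[term] for seg in segments if term in seg]
        out = []
        for doc_id in heapq.merge(*lists):       # k-way merge of sorted posting lists
            if not out or out[-1] != doc_id:     # drop duplicates across segments
                out.append(doc_id)
        merged[term] = out
    return merged


class SegmentedIndex:
    def __init__(self, max_segments=3):
        self.max_segments = max_segments   # hypothetical merge threshold
        self.segments = []

    def add_segment(self, segment):
        self.segments.append(segment)
        if len(self.segments) > self.max_segments:
            self.segments = [merge_segments(self.segments)]


index = SegmentedIndex(max_segments=3)
for seg in [{"car": [1, 4]}, {"car": [2], "red": [2]}, {"red": [3, 5]}, {"car": [4, 6]}]:
    index.add_segment(seg)
print(index.segments)
```

Keeping the number of segments small means a query has fewer posting lists to consult, at the cost of occasional merge work.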
Combining step S13 with step S4: the client's processing and display of the returned result works as follows. After segmenting the new data, the invention retains only the inverted index and not the complete original information. The retrieval result sent to the client is therefore the position information, within the relational database, of the records that satisfy the search conditions; the client then uses this position information to quickly fetch the detailed data from the relational database. This is only one configuration, however. Which original information the retrieval server retains and returns can be configured according to actual needs; that is, the retrieval server may also retain the complete information of the updated data and send it to the client according to the search conditions.
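The two-phase lookup described above might look like the following sketch. It assumes that the "position information" is the record's primary key, which the patent does not specify; the `records` table, the `INDEX` dictionary, and both function names are hypothetical.

```python
import sqlite3

# Hypothetical inverted index held by the retrieval server: term -> primary keys.
INDEX = {"camera": [2, 5], "gate": [5]}


def search_positions(term):
    """Retrieval server side: return only the positions (primary keys) of matches."""
    return INDEX.get(term, [])


def fetch_details(db, ids):
    """Client side: use the returned positions to read the full rows from the database."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    return db.execute(
        f"SELECT id, body FROM records WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(2, "camera offline"), (5, "camera at north gate"), (7, "door sensor")],
)
positions = search_positions("camera")
details = fetch_details(db, positions)
print(positions, details)
```

A primary-key lookup is one of the cheapest operations a relational database offers, so the second phase stays fast even when the table is very large.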
In addition, the invention uses an indexing method suited to the data type of each field: text is segmented into terms and then indexed by term, while numeric values are stored in a binary tree.
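For numeric fields, a binary search tree supports range conditions directly. The sketch below is a deliberately simple, unbalanced tree; the patent only says "binary tree", so the class names, the handling of duplicate values, and the inclusive range semantics are assumptions, and a production index would more likely use a balanced variant.

```python
class Node:
    def __init__(self, value, doc_id):
        self.value = value
        self.doc_ids = [doc_id]   # all documents sharing this value
        self.left = None
        self.right = None


class NumericIndex:
    """Unbalanced binary search tree keyed by a numeric field value."""

    def __init__(self):
        self.root = None

    def insert(self, value, doc_id):
        if self.root is None:
            self.root = Node(value, doc_id)
            return
        node = self.root
        while True:
            if value == node.value:
                node.doc_ids.append(doc_id)
                return
            branch = "left" if value < node.value else "right"
            child = getattr(node, branch)
            if child is None:
                setattr(node, branch, Node(value, doc_id))
                return
            node = child

    def range_search(self, low, high):
        """Return ids of documents whose value lies in [low, high], in value order."""
        result = []

        def visit(node):
            if node is None:
                return
            if low < node.value:       # left subtree may still hold values >= low
                visit(node.left)
            if low <= node.value <= high:
                result.extend(node.doc_ids)
            if node.value < high:      # right subtree may still hold values <= high
                visit(node.right)

        visit(self.root)
        return result


speeds = NumericIndex()
for value, doc_id in [(50, 1), (30, 2), (70, 3), (30, 4), (60, 5)]:
    speeds.insert(value, doc_id)
print(speeds.range_search(30, 60))
```

The range search prunes any subtree that cannot contain values inside the interval, so it visits far fewer nodes than a full scan when the interval is narrow.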
In step S3, content satisfying the search conditions is looked up in the index files and is generally sorted by relevance; it can also be sorted by conditions specified by the user.
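The two sort modes can be sketched as below. The patent does not say how relevance is computed, so the score used here (how often the query terms occur in the record) is a simple stand-in, and the record fields `body` and `ts` and the `rank` function are hypothetical.

```python
def rank(hits, query_terms, sort_key=None, descending=False):
    """Order matching records by a user-specified field, else by a simple relevance score."""
    if sort_key is not None:
        return sorted(hits, key=lambda h: h[sort_key], reverse=descending)

    def relevance(hit):
        # Stand-in score: total occurrences of the query terms in the record body.
        words = hit["body"].lower().split()
        return sum(words.count(t) for t in query_terms)

    return sorted(hits, key=relevance, reverse=True)


hits = [
    {"id": 1, "body": "camera camera gate", "ts": 30},
    {"id": 2, "body": "gate log", "ts": 10},
    {"id": 3, "body": "camera gate gate gate", "ts": 20},
]
by_relevance = [h["id"] for h in rank(hits, ["camera", "gate"])]
by_time = [h["id"] for h in rank(hits, ["camera", "gate"], sort_key="ts")]
print(by_relevance, by_time)
```

Search engines commonly use weighted schemes such as TF-IDF or BM25 in place of the raw count; the structure of the function stays the same.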
To achieve high performance, the index files need to be kept in the retrieval server's memory as far as possible. Besides a well-designed storage format for the index files, the invention also provides a caching mechanism: retrieval results for search conditions that contain an interval are cached in the retrieval server's memory, so that subsequent similar queries receive a faster response.
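A cache for interval queries could be sketched as follows. The patent does not describe the cache's structure or eviction policy, so the least-recently-used policy, the `capacity` limit, and the `(field, low, high)` cache key are assumptions. The sketch only serves repeated identical intervals, whereas the patent speaks more broadly of "similar" queries.

```python
from collections import OrderedDict


class RangeQueryCache:
    """Small LRU cache for results of queries whose condition is an interval."""

    def __init__(self, search_fn, capacity=128):
        self.search_fn = search_fn    # function(field, low, high) -> list of ids
        self.capacity = capacity      # hypothetical size limit
        self.entries = OrderedDict()
        self.hits = 0

    def search(self, field, low, high):
        key = (field, low, high)
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self.entries[key]
        result = self.search_fn(field, low, high)
        self.entries[key] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used entry
        return result


VALUES = {1: 10, 2: 25, 3: 40}   # hypothetical doc id -> numeric field value
calls = []


def slow_search(field, low, high):
    """Stand-in for an index lookup; records each call so cache hits are visible."""
    calls.append((field, low, high))
    return [doc_id for doc_id, v in VALUES.items() if low <= v <= high]


cache = RangeQueryCache(slow_search, capacity=2)
first = cache.search("speed", 20, 50)
second = cache.search("speed", 20, 50)   # served from memory, no second lookup
print(first, second, len(calls), cache.hits)
```

A real deployment would also need to invalidate or refresh cached entries when newly indexed data falls inside a cached interval.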
Compared with the prior art, the invention has the following beneficial effects:
1. The additional retrieval server provides an efficient retrieval function without affecting the existing client.
2. A good caching mechanism lets similar retrievals receive a faster response.
Brief description of the drawings
Fig. 1: Flow chart of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawing.
As shown in Fig. 1, a database retrieval method based on search engine technology comprises:
S1. A retrieval server with a service interface is set up to provide an auxiliary fast retrieval service. At a configured time interval, and using the timestamp of the last data fetch, the retrieval server periodically obtains updated data from the relational database. After performing word segmentation on the new data, it builds an inverted index according to the containment relationship between terms and documents and writes it into a compactly structured index file. When the index files have grown to a certain extent, they are merged;
S2. The application software (the client) sends a retrieval request to the retrieval server through the service interface;
S3. According to the search conditions, the retrieval server looks up the content satisfying the conditions in the index files, generally sorts it by relevance (or by conditions specified by the user), and sends the resulting retrieval result to the client through the service interface;
S4. The application software processes and displays the returned result.
In addition, the method provides a caching mechanism: retrieval results for search conditions that contain an interval are cached in the retrieval server's memory, so that subsequent similar queries receive a faster response.
In practical operation, with 380 million records in a database table, querying the database system with the original method took 236 seconds. With the technical solution of the invention, the response time was 9 seconds, roughly 26 times faster.

Claims (5)

1. A database retrieval method based on search engine technology, characterized in that the method comprises:
S1. A retrieval server with a service interface is set up to provide an auxiliary fast retrieval service; the retrieval server obtains data from a relational database and builds indexes;
S2. A client sends a retrieval request to the retrieval server through the service interface;
S3. The retrieval server obtains a retrieval result according to the search conditions and sends it to the client through the service interface;
S4. The client processes and displays the returned result.
2. The database retrieval method based on search engine technology according to claim 1, characterized in that the retrieval server obtaining data from the relational database and building indexes specifically comprises:
S11. At a configured time interval, and using the timestamp of the last data fetch, the retrieval server periodically obtains updated data from the relational database;
S12. Word segmentation is performed on the new data;
S13. An inverted index is built according to the containment relationship between terms and documents, and is written into a compactly structured index file.
3. The database retrieval method based on search engine technology according to claim 2, characterized in that when the index files have grown to a certain extent, they are merged.
4. The database retrieval method based on search engine technology according to claim 1, characterized in that content satisfying the search conditions is looked up in the index files and is generally sorted by relevance, or alternatively sorted by conditions specified by the user.
5. The database retrieval method based on search engine technology according to claim 2, characterized in that a caching mechanism is also provided: retrieval results for search conditions that contain an interval are cached in the retrieval server's memory, so that subsequent similar queries receive a faster response.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310734758.3A CN103744913A (en) 2013-12-27 2013-12-27 Database retrieval method based on search engine technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310734758.3A CN103744913A (en) 2013-12-27 2013-12-27 Database retrieval method based on search engine technology

Publications (1)

Publication Number Publication Date
CN103744913A true CN103744913A (en) 2014-04-23

Family

ID=50501931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310734758.3A Pending CN103744913A (en) 2013-12-27 2013-12-27 Database retrieval method based on search engine technology

Country Status (1)

Country Link
CN (1) CN103744913A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182546A (en) * 2014-09-09 2014-12-03 北京国双科技有限公司 Method and device for querying data in databases
CN104182546B (en) * 2014-09-09 2017-10-27 北京国双科技有限公司 The data query method and device of database
CN105320754A (en) * 2015-10-08 2016-02-10 上海瀚银信息技术有限公司 Data searching system and method
WO2017101425A1 (en) * 2015-12-15 2017-06-22 乐视控股(北京)有限公司 Apparatus, method and system for use in retrieval of media resources
CN105912545A (en) * 2015-12-15 2016-08-31 乐视网信息技术(北京)股份有限公司 Device, method, and system for media resource retrieval
CN107103011A (en) * 2016-02-23 2017-08-29 阿里巴巴集团控股有限公司 The implementation method and device of terminal data search
CN106777088A (en) * 2016-12-13 2017-05-31 飞狐信息技术(天津)有限公司 The method for sequencing search engines and system of iteratively faster
CN106649804A (en) * 2016-12-29 2017-05-10 深圳市优必选科技有限公司 Data processing method, data processing device and data processing system for data query server
CN106920192A (en) * 2016-12-29 2017-07-04 广州途威慧信息科技有限公司 A kind of educational counseling management system
CN106855890A (en) * 2017-01-09 2017-06-16 广州巨杉软件开发有限公司 A kind of method for realizing the final consistency full-text search of high-performance data storehouse
CN106855890B (en) * 2017-01-09 2020-07-28 深圳巨杉数据库软件有限公司 Method for realizing final consistency full-text retrieval of high-performance database
CN107729518A (en) * 2017-10-26 2018-02-23 山东浪潮云服务信息科技有限公司 The text searching method and device of a kind of relevant database
CN110895538A (en) * 2018-09-13 2020-03-20 深圳市蓝灯鱼智能科技有限公司 Data retrieval method, device, storage medium and processor
CN111259193A (en) * 2020-01-16 2020-06-09 高新兴科技集团股份有限公司 Feature retrieval system based on clustering filtration and application method thereof
CN111259193B (en) * 2020-01-16 2023-08-25 高新兴科技集团股份有限公司 Feature retrieval system based on cluster filtering and application method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140423