CN105468758A - Data retrieval method and device - Google Patents

Data retrieval method and device

Publication number
CN105468758A
Authority
CN
China
Prior art keywords
index database
current
reconstructed
data
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510857487.XA
Other languages
Chinese (zh)
Other versions
CN105468758B (en)
Inventor
虞航仲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201510857487.XA
Publication of CN105468758A
Application granted
Publication of CN105468758B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention provide a data retrieval method and device. The method is applied to a data retrieval device and comprises: constructing two identical index databases corresponding to basic data serving as retrieval objects; when auxiliary data serving as a retrieval object is obtained, determining a first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed; and reconstructing the first index database according to the retrieval objects currently corresponding to the first index database and the auxiliary data obtained this time. The method further comprises: when a data retrieval request is obtained, determining a second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed; and determining the retrieval result corresponding to the data retrieval request based on the second index database and its corresponding retrieval objects. This scheme avoids affecting the response to data retrieval requests while the index database is being updated with auxiliary data.

Description

Data retrieval method and device
Technical field
The present invention relates to the field of data retrieval technology, and in particular to a data retrieval method and device.
Background art
To improve retrieval efficiency, a data retrieval device usually builds an index database for its retrieval objects and then performs data retrieval based on that index database. An index database is generally formed by organizing information extracted from the retrieval objects as index information. For example, for a document, the corresponding index information may be text content extracted from the document or attribute parameters of the document, such as the author's name or the document category.
When a data retrieval device starts, it builds an index database corresponding to the existing basic data that serves as the retrieval objects, and performs subsequent data retrieval based on this index database. Because data resources keep growing, auxiliary data must be added to supplement the existing retrieval objects, and the corresponding index database must then be updated. For example, for data retrieval devices such as those of Baidu and Google, network resources grow every day, so auxiliary data must be added continually and the corresponding index database must be updated continually. In the prior art, each time the data retrieval device obtains newly added auxiliary data, it reconstructs the existing index database based on the added auxiliary data and the existing retrieval objects, and once reconstruction completes, performs subsequent data retrieval based on the reconstructed index database.
Although the existing approach keeps the index database consistent with the retrieval objects, reconstructing the old index database every time auxiliary data is obtained inevitably affects the response to data retrieval requests during the update.
Summary of the invention
The object of the embodiments of the present invention is to provide a data retrieval method and device that avoid affecting the response to data retrieval requests while the index database is being updated with auxiliary data. The specific technical solution is as follows:
In a first aspect, an embodiment of the present invention provides a data retrieval method, applied to a data retrieval device, the method comprising:
constructing two identical index databases corresponding to basic data serving as retrieval objects;
when auxiliary data serving as a retrieval object is obtained, determining a first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed;
reconstructing the first index database according to the retrieval objects currently corresponding to the first index database and the auxiliary data obtained this time;
the method further comprising:
when a data retrieval request is obtained, determining a second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed;
determining, based on the second index database and its corresponding retrieval objects, the retrieval result corresponding to the data retrieval request.
Optionally, determining the first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed comprises:
if two index databases are not currently undergoing data retrieval, randomly selecting one of them as the first index database to be reconstructed;
if one index database is not currently undergoing data retrieval, taking that index database as the first index database to be reconstructed;
and determining the second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed comprises:
if two index databases are not currently undergoing a reconstruction operation, randomly selecting one of them as the second index database to be utilized;
if one index database is not currently undergoing a reconstruction operation, taking that index database as the second index database to be utilized.
Optionally, determining the first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed comprises:
if two index databases are not currently undergoing data retrieval, taking the index database that was not reconstructed according to the most recently obtained auxiliary data as the first index database to be reconstructed;
if one index database is not currently undergoing data retrieval, taking that index database as the first index database to be reconstructed;
and determining the second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed comprises:
if two index databases are not currently undergoing a reconstruction operation, judging whether neither of them has ever been reconstructed; if so, randomly selecting one of them as the second index database to be utilized; otherwise, taking the index database that was reconstructed according to the most recently obtained auxiliary data as the second index database to be utilized;
if one index database is not currently undergoing a reconstruction operation, taking that index database as the second index database to be utilized.
Optionally, determining, based on the second index database and its corresponding retrieval objects, the retrieval result corresponding to the data retrieval request comprises:
determining whether the second index database contains index information matching the search term carried in the data retrieval request, and if so, determining the retrieval result corresponding to the data retrieval request from the retrieval objects currently corresponding to the second index database.
Optionally, the manner of obtaining auxiliary data serving as a retrieval object comprises:
obtaining the auxiliary data based on a web crawler uploading data at regular intervals;
or,
obtaining the auxiliary data by requesting data from a web crawler at regular intervals.
Optionally, the manner of obtaining auxiliary data serving as a retrieval object comprises:
obtaining the auxiliary data through manual data import.
Optionally, the first index database is reconstructed in the same manner in which the two identical index databases corresponding to the basic data serving as retrieval objects are constructed.
Optionally, both the manner of reconstructing the first index database and the manner of constructing the two identical index databases corresponding to the basic data serving as retrieval objects are the inverted list manner.
In a second aspect, an embodiment of the present invention provides a data retrieval device, comprising:
an index database construction module, configured to construct two identical index databases corresponding to basic data serving as retrieval objects;
a to-be-reconstructed index database determination module, configured to determine, when auxiliary data serving as a retrieval object is obtained, a first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed;
an index database reconstruction module, configured to reconstruct the first index database according to the retrieval objects currently corresponding to the first index database and the auxiliary data obtained this time;
the device further comprising:
a to-be-utilized index database determination module, configured to determine, when a data retrieval request is obtained, a second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed;
an index result determination module, configured to determine, based on the second index database and its corresponding retrieval objects, the retrieval result corresponding to the data retrieval request.
Optionally, the to-be-reconstructed index database determination module comprises:
a first to-be-reconstructed index database determination unit, configured to, when auxiliary data serving as a retrieval object is obtained: if two index databases are not currently undergoing data retrieval, randomly select one of them as the first index database to be reconstructed; and if one index database is not currently undergoing data retrieval, take that index database as the first index database to be reconstructed;
and the to-be-utilized index database determination module comprises:
a first to-be-utilized index database determination unit, configured to, when a data retrieval request is obtained: if two index databases are not currently undergoing a reconstruction operation, randomly select one of them as the second index database to be utilized; and if one index database is not currently undergoing a reconstruction operation, take that index database as the second index database to be utilized.
Optionally, the to-be-reconstructed index database determination module comprises:
a second to-be-reconstructed index database determination unit, configured to, when auxiliary data serving as a retrieval object is obtained: if two index databases are not currently undergoing data retrieval, take the index database that was not reconstructed according to the most recently obtained auxiliary data as the first index database to be reconstructed; and if one index database is not currently undergoing data retrieval, take that index database as the first index database to be reconstructed;
and the to-be-utilized index database determination module comprises:
a second to-be-utilized index database determination unit, configured to, when a data retrieval request is obtained: if two index databases are not currently undergoing a reconstruction operation, judge whether neither of them has ever been reconstructed, and if so, randomly select one of them as the second index database to be utilized, otherwise take the index database that was reconstructed according to the most recently obtained auxiliary data as the second index database to be utilized; and if one index database is not currently undergoing a reconstruction operation, take that index database as the second index database to be utilized.
Optionally, the index result determination module comprises:
an index result determination unit, configured to determine whether the second index database contains index information matching the search term carried in the data retrieval request, and if so, to determine the retrieval result corresponding to the data retrieval request from the retrieval objects currently corresponding to the second index database.
Optionally, the manner in which the to-be-reconstructed index database determination module obtains auxiliary data serving as a retrieval object comprises:
obtaining the auxiliary data based on a web crawler uploading data at regular intervals;
or,
obtaining the auxiliary data by requesting data from a web crawler at regular intervals.
Optionally, the manner in which the to-be-reconstructed index database determination module obtains auxiliary data serving as a retrieval object comprises:
obtaining the auxiliary data through manual data import.
Optionally, the manner in which the index database reconstruction module reconstructs the first index database is the same as the manner in which the index database construction module constructs the two identical index databases corresponding to the basic data serving as retrieval objects.
Optionally, both the manner in which the index database reconstruction module reconstructs the first index database and the manner in which the index database construction module constructs the two identical index databases corresponding to the basic data serving as retrieval objects are the inverted list manner.
Compared with the prior art, this solution constructs two index databases in advance. After auxiliary data is obtained, the index databases not currently undergoing data retrieval are identified among the two, the first index database to be reconstructed is determined from them, and the first index database is reconstructed according to its currently corresponding retrieval objects and the auxiliary data obtained this time. In addition, when a data retrieval request is obtained, the index databases not currently undergoing a reconstruction operation are identified among the two, the second index database to be utilized is determined from them, and the retrieval result corresponding to the data retrieval request is determined based on the second index database and its corresponding retrieval objects. This achieves the object of not affecting the response to data retrieval requests while the index database is being updated with auxiliary data.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data retrieval method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a data retrieval device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
To avoid affecting the response to data retrieval requests while an index database is being updated with auxiliary data, the embodiments of the present invention provide a data retrieval method and device.
The data retrieval method provided by the embodiments of the present invention is introduced first.
It should be noted that the data retrieval method provided by the embodiments of the present invention is applied to a data retrieval device.
As shown in Fig. 1, the data retrieval method provided by an embodiment of the present invention may comprise the following steps:
S101: construct two identical index databases corresponding to the basic data serving as retrieval objects.
When the data retrieval device starts, it can use a predetermined construction manner to construct two identical index databases corresponding to the basic data serving as retrieval objects. The basic data can be obtained using existing techniques, for example by manual import or by crawling with a web crawler.
The predetermined construction manner can be an existing one; for example, it can be the inverted list manner.
It should be emphasized that "basic" in "basic data" and "auxiliary" in "auxiliary data" are used only to distinguish data serving as retrieval objects that exist at different times, and have no limiting meaning. Similarly, "first" in "first index database" and "second" in "second index database" are used only to distinguish the index database to be reconstructed from the index database to be utilized, and have no limiting meaning.
S102: when auxiliary data serving as a retrieval object is obtained, determine a first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed.
To enrich the retrieval objects, auxiliary data serving as retrieval objects can be obtained repeatedly. Each time auxiliary data is obtained, so that reconstructing an index database does not interfere with data retrieval, the first index database to be reconstructed is determined from at least one index database on which data retrieval is not currently being performed. Because the first index database is one that is not currently undergoing data retrieval, the other index database can serve data retrieval requests while the first index database is being reconstructed, which avoids affecting the response to data retrieval requests while the index database is being updated with auxiliary data. Further, each index database can be provided with a status identifier, from which it can be learned whether the index database has ever been reconstructed and what its current state is. The current state is one of: being reconstructed, being searched, or idle, where idle means that the index database is neither being reconstructed nor being searched. For example: status identifier 000 indicates an index database that has never been reconstructed and is currently idle; 010 indicates one that has never been reconstructed and is currently being searched; 100 indicates one that has been reconstructed and is currently idle; 110 indicates one that has been reconstructed and is currently being searched; and 101 indicates one that has been reconstructed and is currently being reconstructed.
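The status identifier can be sketched as a small three-flag record. This is only an illustration: it assumes the three digits mean, in order, "has been reconstructed", "is being searched", and "is being reconstructed", which is inferred from the five example codes rather than stated in the text, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexStatus:
    """Hypothetical three-digit status identifier for one index database."""
    was_reconstructed: bool = False    # first digit (assumed meaning)
    being_searched: bool = False       # second digit (assumed meaning)
    being_reconstructed: bool = False  # third digit (assumed meaning)

    @classmethod
    def parse(cls, identifier: str) -> "IndexStatus":
        # Accept only strings made of exactly three binary digits.
        if len(identifier) != 3 or set(identifier) - {"0", "1"}:
            raise ValueError(f"expected three binary digits, got {identifier!r}")
        first, second, third = (digit == "1" for digit in identifier)
        return cls(first, second, third)

    def identifier(self) -> str:
        flags = (self.was_reconstructed, self.being_searched, self.being_reconstructed)
        return "".join("1" if flag else "0" for flag in flags)

    @property
    def idle(self) -> bool:
        # Idle: neither being searched nor being reconstructed.
        return not self.being_searched and not self.being_reconstructed
```

Under these assumptions, `IndexStatus.parse("100").idle` is true, while `IndexStatus.parse("101")` describes an index database that is currently being reconstructed and so should not serve retrieval requests.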
Specifically, in a first implementation, determining the first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed can comprise:
if two index databases are not currently undergoing data retrieval, randomly selecting one of them as the first index database to be reconstructed;
if one index database is not currently undergoing data retrieval, taking that index database as the first index database to be reconstructed.
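The random selection rule of this first implementation can be sketched as follows. The function name and the use of string names for index databases are hypothetical, and returning `None` when every index database is being searched is an assumption, since the text does not cover that case.

```python
import random
from typing import Iterable, Optional, Set


def pick_for_reconstruction(names: Iterable[str], being_searched: Set[str]) -> Optional[str]:
    """Pick the index database to reconstruct, per the first implementation."""
    # Only index databases not currently undergoing data retrieval are eligible.
    candidates = [name for name in names if name not in being_searched]
    if not candidates:
        return None  # assumption: the source does not describe this case
    if len(candidates) == 1:
        return candidates[0]
    # Two eligible index databases: choose one at random.
    return random.choice(candidates)
```

For instance, with index databases `"A"` and `"B"` and `"A"` currently being searched, the function returns `"B"`.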
In a second implementation, to ensure that both index databases remain highly available, the two index databases are reconstructed in turn. Based on this idea, determining the first index database to be reconstructed from at least one index database on which data retrieval is not currently being performed can comprise:
if two index databases are not currently undergoing data retrieval, taking the index database that was not reconstructed according to the most recently obtained auxiliary data as the first index database to be reconstructed;
if one index database is not currently undergoing data retrieval, taking that index database as the first index database to be reconstructed.
It can be understood that, in the second implementation, if two index databases are not currently undergoing data retrieval and neither has ever been reconstructed, one of them can be selected at random as the first index database to be reconstructed.
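The alternating rule of the second implementation can be sketched in the same style. Here `last_reconstructed` is a hypothetical variable that records which index database was reconstructed with the most recent auxiliary data (`None` if neither has been reconstructed yet); returning `None` when no index database is eligible is again an assumption.

```python
import random
from typing import Iterable, Optional, Set


def pick_alternating(
    names: Iterable[str],
    being_searched: Set[str],
    last_reconstructed: Optional[str],
) -> Optional[str]:
    """Pick the index database to reconstruct, per the second implementation."""
    candidates = [name for name in names if name not in being_searched]
    if not candidates:
        return None  # assumption: the source does not describe this case
    if len(candidates) == 1:
        return candidates[0]
    if last_reconstructed is None:
        # Neither index database has been reconstructed yet: choose at random.
        return random.choice(candidates)
    # Otherwise take the one that was NOT reconstructed last time.
    return next(name for name in candidates if name != last_reconstructed)
```

With both index databases free and `"A"` reconstructed last, the call returns `"B"`, so successive batches of auxiliary data alternate between the two.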
In addition, it should be noted that auxiliary data serving as retrieval objects can be obtained in several ways. For clarity, several specific implementations are introduced below:
Manner one: obtain the auxiliary data serving as retrieval objects based on a web crawler uploading data at regular intervals.
In this implementation, the web crawler crawls auxiliary data serving as retrieval objects on the network according to a predetermined crawling task and uploads the crawled auxiliary data to the data retrieval device at regular intervals. The web crawler can crawl the auxiliary data in the way prior-art web crawlers crawl network data.
Manner two: obtain the auxiliary data serving as retrieval objects by requesting data from a web crawler at regular intervals.
In this implementation, the web crawler crawls auxiliary data serving as retrieval objects on the network according to a predetermined crawling task and caches the crawled network data; the data retrieval device then requests the crawled network data from the web crawler at regular intervals. Again, the web crawler can crawl the auxiliary data in the way prior-art web crawlers crawl network data.
Manner three: obtain the auxiliary data serving as retrieval objects through manual data import.
In this implementation, auxiliary data serving as retrieval objects is collected manually and then imported manually through a data import entry provided by the data retrieval device.
It should be emphasized that the above manners of obtaining auxiliary data serving as retrieval objects are only examples and should not be construed as limiting the embodiments of the present invention.
S103: reconstruct the first index database according to the retrieval objects currently corresponding to the first index database and the auxiliary data obtained this time.
After the first index database to be reconstructed is determined, it can be reconstructed according to the retrieval objects currently corresponding to it and the auxiliary data obtained this time.
It should be noted that the first index database is reconstructed in the same manner in which the two identical index databases corresponding to the basic data serving as retrieval objects are constructed. For example, both can use the inverted list manner, although the manner is of course not limited to this.
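As a rough illustration of reconstruction in the inverted list manner, the sketch below rebuilds an index from the current retrieval objects plus the newly obtained auxiliary data. It assumes retrieval objects are plain text keyed by an integer id and that index information is simply each lowercase whitespace-separated word; the function names and this tokenization are hypothetical and not taken from the source.

```python
from typing import Dict, Set, Tuple


def build_inverted_list(objects: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each word to the ids of the retrieval objects that contain it."""
    index: Dict[str, Set[int]] = {}
    for object_id, text in objects.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(object_id)
    return index


def reconstruct(
    current_objects: Dict[int, str],
    auxiliary: Dict[int, str],
) -> Tuple[Dict[int, str], Dict[str, Set[int]]]:
    """Merge the auxiliary data into the retrieval objects and rebuild the index."""
    merged = {**current_objects, **auxiliary}  # inputs are left unmodified
    return merged, build_inverted_list(merged)
```

Because `reconstruct` uses the same `build_inverted_list` routine as the initial construction, the reconstructed index database has the same structure as the one built from the basic data.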
S104: when a data retrieval request is obtained, determine a second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed.
When a data retrieval request is obtained, the second index database to be utilized is determined, in order to respond to the request, from at least one index database on which a reconstruction operation is not currently being performed. Because the second index database is one that is not currently undergoing a reconstruction operation, the other index database can be reconstructed based on obtained auxiliary data while the data retrieval is in progress.
Specifically, in one implementation, corresponding to the first implementation of determining the first index database to be reconstructed, determining the second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed can comprise:
if two index databases are not currently undergoing a reconstruction operation, randomly selecting one of them as the second index database to be utilized;
if one index database is not currently undergoing a reconstruction operation, taking that index database as the second index database to be utilized.
Specifically, in another implementation, corresponding to the second implementation of determining the first index database to be reconstructed, determining the second index database to be utilized from at least one index database on which a reconstruction operation is not currently being performed can comprise:
if two index databases are not currently undergoing a reconstruction operation, judging whether neither of them has ever been reconstructed; if so, randomly selecting one of them as the second index database to be utilized; otherwise, taking the index database that was reconstructed according to the most recently obtained auxiliary data as the second index database to be utilized;
if one index database is not currently undergoing a reconstruction operation, taking that index database as the second index database to be utilized.
In this implementation, which accompanies reconstruction in turn, if two index databases are not currently undergoing a reconstruction operation, it can be judged whether neither of them has ever been reconstructed. If so, the two index databases are identical, and one of them can be selected at random as the second index database to be utilized. Otherwise, the index database that was reconstructed according to the most recently obtained auxiliary data can be taken as the second index database to be utilized.
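The selection of the second index database under this implementation can be sketched as below. As an assumption, `last_reconstructed` being `None` stands for "neither index database has ever been reconstructed"; the function and parameter names are hypothetical, and returning `None` when both index databases are being reconstructed is not something the text specifies.

```python
import random
from typing import Iterable, Optional, Set


def pick_for_retrieval(
    names: Iterable[str],
    being_reconstructed: Set[str],
    last_reconstructed: Optional[str],
) -> Optional[str]:
    """Pick the index database that should answer a data retrieval request."""
    # Only index databases not currently undergoing reconstruction are eligible.
    candidates = [name for name in names if name not in being_reconstructed]
    if not candidates:
        return None  # assumption: the source does not describe this case
    if len(candidates) == 1:
        return candidates[0]
    if last_reconstructed is None or last_reconstructed not in candidates:
        # Neither has been reconstructed, so they are identical: choose at random.
        return random.choice(candidates)
    # Prefer the index database reconstructed with the most recent auxiliary data.
    return last_reconstructed
```

When `"B"` is in the middle of a reconstruction, the request is answered from `"A"` even if `"B"` was also the one reconstructed last time.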
S105: determine, based on the second index database and its corresponding retrieval objects, the retrieval result corresponding to the data retrieval request.
After the second index database to be utilized is determined, the retrieval result corresponding to the data retrieval request can be determined based on the second index database and its corresponding retrieval objects. Once determined, the retrieval result can be output so that the sender of the data retrieval request can learn of it.
Concrete, describedly determine the result for retrieval corresponding to this data retrieval request based on this second index database and corresponding searching object, can comprise:
From described second index database, determine whether there is the index information matched with term entrained by described data retrieval request, if existed, from the current corresponding searching object of described second index database, determine the result for retrieval corresponding to described data retrieval request.
Wherein, be understandable that, in data directory request, carry term, the index information that at least one and term match can be there is in the second index database, or there is not the index information matched with retrieving information, this is all rational.So-called term matches with index information and specifically refers to: partial content and the index information of term are identical, term and index information is identical and/or term is contained in index information, etc.Further, when all there is not the index information matched with term entrained by this data retrieval request in the second index database, now, then the result for retrieval corresponding to this data retrieval request can be defined as content for empty.For example: suppose that term is " notebook ", so, to mate for term is contained in the situation of index information with index information for term, the index information mated with this term can comprise: " notebook rank ", mate for the partial content of the term situation identical with index information for term and index information, the index information mated with this term can comprise: " cloud is taken down notes ", mate for the term situation identical with index information for term and index information, the index information mated with this term can comprise " notebook ".
Further, when the second index database contains exactly one piece of index information matching the search term, the retrieval result associated with that index information, among the search objects currently corresponding to the second index database, is the retrieval result corresponding to the data retrieval request. When the second index database contains at least two pieces of index information matching the search term, a primary retrieval result is determined for each of them from the search objects currently corresponding to the second index database, and the union of these primary retrieval results is taken as the retrieval result corresponding to the data retrieval request.
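The lookup and union steps described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the dictionary layout, the function names `find_matching_keys` and `retrieve`, and the document identifiers are assumptions, and only two of the matching modes are shown (the search term being identical to the index information, and the search term being contained in it).

```python
def find_matching_keys(index, term):
    """Return index entries that match the term.

    A substring test covers both modes shown here: the term is identical
    to the index information, or the term is contained in it.
    """
    return [key for key in index if term in key]


def retrieve(index, term):
    """Take the union of the primary results of every matching entry.

    Returns an empty set when no index information matches the term.
    """
    results = set()
    for key in find_matching_keys(index, term):
        results |= index[key]
    return results


# Hypothetical index database: index information -> ids of search objects.
index = {
    "notebook": {"doc1", "doc2"},
    "notebook rank": {"doc2", "doc3"},
    "tablet": {"doc4"},
}

print(sorted(retrieve(index, "notebook")))
print(sorted(retrieve(index, "phone")))
```

With the sample data, the term "notebook" matches two entries and yields the union of their results, while "phone" matches nothing and yields an empty result.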
It should be emphasized that the specific way of determining the retrieval result corresponding to the data retrieval request from the search objects currently corresponding to the second index database may follow the prior art and is not described further here.
For example, the data retrieval device builds two identical index databases, index database A and index database B, corresponding to the basic data serving as search objects. The first time a reconstruction is performed according to auxiliary data, one index database is selected at random and reconstructed; thereafter, index database A and index database B are reconstructed in turn. When a data retrieval request is obtained and neither index database A nor index database B is currently being reconstructed, it is determined whether neither of them has ever been reconstructed. If so, one index database is selected at random to perform the data retrieval; otherwise, the index database with the most up-to-date index information, that is, the one reconstructed according to the most recently obtained auxiliary data, is used as the basis for the data retrieval.
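The selection rules of this example can be sketched as follows. The class `IndexDatabase`, its fields, and the functions `pick_for_rebuild` and `pick_for_query` are hypothetical names introduced for illustration; the sketch also assumes that at least one index database is available whenever a selection is made, as the method presupposes.

```python
import random
from dataclasses import dataclass


@dataclass
class IndexDatabase:
    name: str
    serving: bool = False      # a data retrieval is currently running on it
    rebuilding: bool = False   # a reconstruction is currently running on it
    rebuild_round: int = 0     # 0 means it has never been reconstructed


def pick_for_rebuild(databases):
    """Choose the first index database (the one to reconstruct)."""
    idle = [db for db in databases if not db.serving]
    if len(idle) == 1:
        return idle[0]
    if all(db.rebuild_round == 0 for db in idle):
        return random.choice(idle)  # first reconstruction: random choice
    # Otherwise take the one NOT rebuilt from the latest auxiliary data.
    return min(idle, key=lambda db: db.rebuild_round)


def pick_for_query(databases):
    """Choose the second index database (the one to answer a request)."""
    ready = [db for db in databases if not db.rebuilding]
    if len(ready) == 1:
        return ready[0]
    if all(db.rebuild_round == 0 for db in ready):
        return random.choice(ready)  # neither was ever rebuilt
    # Otherwise take the one rebuilt from the latest auxiliary data.
    return max(ready, key=lambda db: db.rebuild_round)


a = IndexDatabase("A", rebuild_round=1)  # rebuilt in the last round
b = IndexDatabase("B")                   # never rebuilt
print(pick_for_rebuild([a, b]).name)
print(pick_for_query([a, b]).name)
```

Under these assumptions, reconstruction alternates between the two index databases, while queries are served from whichever one holds the newest index information.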
Compared with the prior art, this solution builds two index databases in advance. After auxiliary data is obtained, the index databases on which data retrieval is not currently being performed (at least one) are identified among the two; from these, the first index database to be reconstructed is determined, and it is reconstructed according to its current corresponding search objects and the newly obtained auxiliary data. When a data retrieval request is obtained, the index databases on which reconstruction is not currently being performed (at least one) are identified among the two; from these, the second index database to be used is determined, and the retrieval result corresponding to the data retrieval request is determined based on that second index database and its corresponding search objects. In this way, updating an index database with auxiliary data does not interfere with responding to data retrieval requests.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a data retrieval device. As shown in Figure 2, the device may comprise:
An index database building module 210, configured to build two identical index databases corresponding to the basic data serving as search objects;
A to-be-reconstructed index database determination module 220, configured to, when auxiliary data serving as search objects is obtained, determine the first index database to be reconstructed from among the at least one index database on which data retrieval is not currently being performed;
An index database reconstruction module 230, configured to reconstruct the first index database according to the search objects currently corresponding to the first index database and the newly obtained auxiliary data;
The device further comprises:
A to-be-used index database determination module 240, configured to, when a data retrieval request is obtained, determine the second index database to be used from among the at least one index database on which reconstruction is not currently being performed;
A retrieval result determination module 250, configured to determine the retrieval result corresponding to the data retrieval request based on the second index database and its corresponding search objects.
Compared with the prior art, this solution builds two index databases in advance. After auxiliary data is obtained, the index databases on which data retrieval is not currently being performed (at least one) are identified among the two; from these, the first index database to be reconstructed is determined, and it is reconstructed according to its current corresponding search objects and the newly obtained auxiliary data. When a data retrieval request is obtained, the index databases on which reconstruction is not currently being performed (at least one) are identified among the two; from these, the second index database to be used is determined, and the retrieval result corresponding to the data retrieval request is determined based on that second index database and its corresponding search objects. In this way, updating an index database with auxiliary data does not interfere with responding to data retrieval requests.
In a first implementation, the to-be-reconstructed index database determination module 220 may comprise:
A first to-be-reconstructed index database determining unit, configured to, when auxiliary data serving as search objects is obtained: if there are currently two index databases on which data retrieval is not being performed, select one of them at random as the first index database to be reconstructed; and if there is currently one index database on which data retrieval is not being performed, take that index database as the first index database to be reconstructed;
The to-be-used index database determination module 240 may comprise:
A first to-be-used index database determining unit, configured to, when a data retrieval request is obtained: if there are currently two index databases on which reconstruction is not being performed, select one of them at random as the second index database to be used; and if there is currently one index database on which reconstruction is not being performed, take that index database as the second index database to be used.
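The random selection used by both determining units in this implementation can be sketched with a single helper. The function name `pick_random_available` and the representation of index databases as name strings are assumptions made for illustration; the sketch assumes at least one index database is not busy.

```python
import random


def pick_random_available(databases, busy):
    """Pick an index database that is not busy.

    With two candidates the choice is random; with one candidate that
    candidate is returned, since random.choice has nothing else to pick.
    """
    available = [name for name in databases if name not in busy]
    return random.choice(available)


databases = ["A", "B"]

# Choosing the first index database: exclude those serving a retrieval.
to_rebuild = pick_random_available(databases, busy={"A"})

# Choosing the second index database: exclude those being reconstructed.
to_query = pick_random_available(databases, busy={to_rebuild})

print(to_rebuild, to_query)
```

The same helper serves both roles because the two rules differ only in which operation makes an index database unavailable.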
In a second implementation, the to-be-reconstructed index database determination module 220 may comprise:
A second to-be-reconstructed index database determining unit, configured to, when auxiliary data serving as search objects is obtained: if there are currently two index databases on which data retrieval is not being performed, take the index database that was not reconstructed according to the most recently obtained auxiliary data as the first index database to be reconstructed; and if there is currently one index database on which data retrieval is not being performed, take that index database as the first index database to be reconstructed;
The to-be-used index database determination module 240 may comprise:
A second to-be-used index database determining unit, configured to, when a data retrieval request is obtained: if there are currently two index databases on which reconstruction is not being performed, determine whether neither of them has ever been reconstructed and, if so, select one of them at random as the second index database to be used, or otherwise take the index database that was reconstructed according to the most recently obtained auxiliary data as the second index database to be used; and if there is currently one index database on which reconstruction is not being performed, take that index database as the second index database to be used.
Specifically, the retrieval result determination module 250 may comprise:
A retrieval result determining unit, configured to determine whether the second index database contains index information that matches the search term carried in the data retrieval request and, if so, to determine the retrieval result corresponding to the data retrieval request from the search objects currently corresponding to the second index database.
Specifically, the manner in which the to-be-reconstructed index database determination module obtains the auxiliary data serving as search objects comprises:
Obtaining the auxiliary data serving as search objects by having a web crawler upload data at scheduled intervals;
Or,
Obtaining the auxiliary data serving as search objects by requesting data from a web crawler at scheduled intervals.
Specifically, the manner in which the to-be-reconstructed index database determination module obtains the auxiliary data serving as search objects comprises:
Obtaining the auxiliary data serving as search objects by manual data import.
Specifically, the index database reconstruction module reconstructs the first index database in the same manner in which the index database building module builds the two identical index databases corresponding to the basic data serving as search objects.
Specifically, the manner in which the index database reconstruction module reconstructs the first index database, and in which the index database building module builds the two identical index databases corresponding to the basic data serving as search objects, is the inverted list approach.
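An inverted list maps each piece of index information to the search objects that contain it. The following minimal sketch builds such a structure and reuses the same routine for reconstruction, as the text above requires; whitespace tokenisation, the function names, and the sample documents are assumptions, since the embodiment does not specify them.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map every word to the set of document ids in which it appears."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return dict(index)


def rebuild(current_documents, auxiliary_documents):
    """Reconstruct from the current search objects plus the new auxiliary data.

    Reconstruction uses the same routine as the initial build.
    """
    merged = {**current_documents, **auxiliary_documents}
    return build_inverted_index(merged)


base = {"d1": "cloud notebook"}        # hypothetical basic data
auxiliary = {"d2": "notebook rank"}    # hypothetical auxiliary data

index_before = build_inverted_index(base)
index_after = rebuild(base, auxiliary)
print(sorted(index_before["notebook"]))
print(sorted(index_after["notebook"]))
```

After the rebuild, the entry for "notebook" covers both the original search object and the one supplied as auxiliary data.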
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding part of the method embodiment.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A data retrieval method, characterized in that it is applied to a data retrieval device, the method comprising:
Building two identical index databases corresponding to the basic data serving as search objects;
When auxiliary data serving as search objects is obtained, determining the first index database to be reconstructed from among the at least one index database on which data retrieval is not currently being performed;
Reconstructing the first index database according to the search objects currently corresponding to the first index database and the newly obtained auxiliary data;
The method further comprising:
When a data retrieval request is obtained, determining the second index database to be used from among the at least one index database on which reconstruction is not currently being performed;
Determining the retrieval result corresponding to the data retrieval request based on the second index database and its corresponding search objects.
2. The method according to claim 1, characterized in that determining the first index database to be reconstructed from among the at least one index database on which data retrieval is not currently being performed comprises:
If there are currently two index databases on which data retrieval is not being performed, selecting one of them at random as the first index database to be reconstructed;
If there is currently one index database on which data retrieval is not being performed, taking that index database as the first index database to be reconstructed;
And determining the second index database to be used from among the at least one index database on which reconstruction is not currently being performed comprises:
If there are currently two index databases on which reconstruction is not being performed, selecting one of them at random as the second index database to be used;
If there is currently one index database on which reconstruction is not being performed, taking that index database as the second index database to be used.
3. The method according to claim 1, characterized in that determining the first index database to be reconstructed from among the at least one index database on which data retrieval is not currently being performed comprises:
If there are currently two index databases on which data retrieval is not being performed, taking the index database that was not reconstructed according to the most recently obtained auxiliary data as the first index database to be reconstructed;
If there is currently one index database on which data retrieval is not being performed, taking that index database as the first index database to be reconstructed;
And determining the second index database to be used from among the at least one index database on which reconstruction is not currently being performed comprises:
If there are currently two index databases on which reconstruction is not being performed, determining whether neither of them has ever been reconstructed and, if so, selecting one of them at random as the second index database to be used, or otherwise taking the index database that was reconstructed according to the most recently obtained auxiliary data as the second index database to be used;
If there is currently one index database on which reconstruction is not being performed, taking that index database as the second index database to be used.
4. The method according to any one of claims 1-3, characterized in that determining the retrieval result corresponding to the data retrieval request based on the second index database and its corresponding search objects comprises:
Determining whether the second index database contains index information that matches the search term carried in the data retrieval request and, if so, determining the retrieval result corresponding to the data retrieval request from the search objects currently corresponding to the second index database.
5. The method according to any one of claims 1-3, characterized in that the manner of obtaining the auxiliary data serving as search objects comprises:
Obtaining the auxiliary data serving as search objects by having a web crawler upload data at scheduled intervals;
Or,
Obtaining the auxiliary data serving as search objects by requesting data from a web crawler at scheduled intervals.
6. The method according to any one of claims 1-3, characterized in that the manner of obtaining the auxiliary data serving as search objects comprises:
Obtaining the auxiliary data serving as search objects by manual data import.
7. The method according to any one of claims 1-3, characterized in that the manner of reconstructing the first index database is the same as the manner of building the two identical index databases corresponding to the basic data serving as search objects.
8. The method according to claim 7, characterized in that the manner of reconstructing the first index database and the manner of building the two identical index databases corresponding to the basic data serving as search objects are both the inverted list approach.
9. A data retrieval device, characterized in that it comprises:
An index database building module, configured to build two identical index databases corresponding to the basic data serving as search objects;
A to-be-reconstructed index database determination module, configured to, when auxiliary data serving as search objects is obtained, determine the first index database to be reconstructed from among the at least one index database on which data retrieval is not currently being performed;
An index database reconstruction module, configured to reconstruct the first index database according to the search objects currently corresponding to the first index database and the newly obtained auxiliary data;
The device further comprising:
A to-be-used index database determination module, configured to, when a data retrieval request is obtained, determine the second index database to be used from among the at least one index database on which reconstruction is not currently being performed;
A retrieval result determination module, configured to determine the retrieval result corresponding to the data retrieval request based on the second index database and its corresponding search objects.
10. The device according to claim 9, characterized in that the to-be-reconstructed index database determination module comprises:
A first to-be-reconstructed index database determining unit, configured to, when auxiliary data serving as search objects is obtained: if there are currently two index databases on which data retrieval is not being performed, select one of them at random as the first index database to be reconstructed; and if there is currently one index database on which data retrieval is not being performed, take that index database as the first index database to be reconstructed;
The to-be-used index database determination module comprises:
A first to-be-used index database determining unit, configured to, when a data retrieval request is obtained: if there are currently two index databases on which reconstruction is not being performed, select one of them at random as the second index database to be used; and if there is currently one index database on which reconstruction is not being performed, take that index database as the second index database to be used.
CN201510857487.XA 2015-11-30 2015-11-30 Data retrieval method and device Active CN105468758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510857487.XA CN105468758B (en) 2015-11-30 2015-11-30 Data retrieval method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510857487.XA CN105468758B (en) 2015-11-30 2015-11-30 Data retrieval method and device

Publications (2)

Publication Number Publication Date
CN105468758A true CN105468758A (en) 2016-04-06
CN105468758B CN105468758B (en) 2019-08-09

Family

ID=55606458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510857487.XA Active CN105468758B (en) 2015-11-30 2015-11-30 Data retrieval method and device

Country Status (1)

Country Link
CN (1) CN105468758B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229358A (en) * 2017-12-22 2018-06-29 北京市商汤科技开发有限公司 Index establishing method and device, electronic equipment, computer storage media, program
CN108874983A (en) * 2018-06-12 2018-11-23 陕西师范大学 A kind of computerized data retrieval method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080052284A1 (en) * 2006-08-05 2008-02-28 Terry Stokes System and Method for the Capture and Archival of Electronic Communications
CN101246500A (en) * 2008-03-27 2008-08-20 腾讯科技(深圳)有限公司 Retrieval system and method for implementing data fast indexing
CN101882142A (en) * 2009-05-08 2010-11-10 富士通株式会社 Index combining method and index combining device

Also Published As

Publication number Publication date
CN105468758B (en) 2019-08-09

Similar Documents

Publication Publication Date Title
US20210374109A1 (en) Apparatus, systems, and methods for batch and realtime data processing
CN101640613B (en) Method and device for network resource relating management
CN103488759A (en) Method and device for searching application programs according to key words
US20140280070A1 (en) System and method for providing technology assisted data review with optimizing features
CN102023983B (en) Managing method of statistical space-time database
CN103136342B (en) The searching method of application A PP, system and search server
CN104965735A (en) Apparatus for generating upgrade SQL script
CN103678494A (en) Method and device for client side and server side data synchronization
US11226953B2 (en) Technique for generating a change cache database utilized to inspect changes made to a repository
CN104239377A (en) Platform-crossing data retrieval method and device
CN110866029B (en) sql statement construction method, device, server and readable storage medium
CN108563697B (en) Data processing method, device and storage medium
CN104090958A (en) Semantic information retrieval system and method based on domain ontology
CN108776678B (en) Index creation method and device based on mobile terminal NoSQL database
CN111813378B (en) Code base construction system, method and related device
Ceri et al. Data management for heterogeneous genomic datasets
CN108154024B (en) Data retrieval method and device and electronic equipment
CN105488165A (en) Data retrieval method and system based on index database
CN105468758A (en) Data retrieval method and device
CN115705313A (en) Data processing method, device, equipment and computer readable storage medium
CN110647673A (en) Method for realizing ecological environment space big data integration and sharing
CN115455006A (en) Data processing method, data processing device, electronic device, and storage medium
CN109684331A (en) A kind of object storage meta data management device and method based on Kudu
CN110472125B (en) Multistage page cascading crawling method and equipment based on web crawler
CN112150102A (en) Smart grid information system account updating method and device combining RPA and AI

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant