CN1770162A - Database management system, program and recording medium - Google Patents

Publication number
CN1770162A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200510126849
Other languages
Chinese (zh)
Other versions
CN100495394C (en)
Inventor
大瀬户太
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Application filed by Ricoh Co Ltd
Publication of CN1770162A
Application granted
Publication of CN100495394C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a database management system for managing a database, comprising: a first data storage component that executes data search operations quickly and data alteration operations slowly; second data storage components, each of which executes data search operations slowly and the insert and delete operations among data alteration operations quickly; a data transfer component that transfers data from each second data storage component to the first data storage component so as to reflect the results of insert or delete operations; a database operation request processing component that executes operation requests on the database; a transaction processing component that ensures data consistency between the data transfer component and the database operation request processing component; and a file switching component that switches the data storage components between operation requests on the database and asynchronous merge processing.

Description

Database management system (DBMS) for managing a database
This application is a divisional application of the invention patent application filed on June 6, 2003, with application number 03133014.2, entitled "Full-text search device performing merge processing and registration/deletion processing".
Technical field
The present invention relates generally to a database management system (DBMS), a full-text search device, and a full-text search method, and more particularly to a database management system that completes merge processing by deferred update, and to a full-text search device and method for finding, among a plurality of document data, the documents that contain a specific character string. The present invention is applicable to systems that manage large volumes of document data, such as document management systems, electronic library systems, and patent publication retrieval systems.
Background art
In a relational database, data is represented and managed as tables. A table consists of a set of tuples, where each tuple is a sequence of attribute values. The table itself is stored in a file.
Operations on a database are classified into the following four types.
(1) Search (retrieval) operation
This operation takes a condition on attribute values as the search condition and retrieves the tuples that match it.
(2) Insert operation
This operation inserts a new tuple with given attribute values into a table.
(3) Update operation
This operation changes the attribute values of tuples selected from a table to new values.
(4) Delete operation
This operation deletes selected tuples from a table.
Hereinafter, the insert, update, and delete operations are collectively called alteration operations.
In a system that uses a relational database, the response time of search operations is an important performance index.
Therefore, to shorten the search response time, relational databases are commonly built with index files.
An index file contains a specific structure derived from one or more attribute values so that conditions on those attribute values can be evaluated quickly.
In alteration operations, on the other hand, the time needed to update the index file is one factor that degrades execution performance.
In the usual way of using index files, alteration operations are requested far less often than search operations, and large numbers of alteration operations are carried out at night while the system is out of service. Performance is therefore measured mainly by the response time of search operations.
However, in online systems that require real-time behavior, the response time of alteration operations also becomes important.
To address this problem, in the "database management system" disclosed in Japanese Laid-Open Patent Application No. 10-143412, data to be written to the database is held temporarily in nonvolatile memory before being reflected to disk, and the nonvolatile memory is used as a disk cache so that the corresponding data is looked up there instead of on disk.
However, because only data of simple structure is stored in the disk cache, there is a problem that a high-performance index file cannot be used.
In addition, in the "database management method and apparatus, and machine-readable recording medium on which the program is recorded" disclosed in Japanese Laid-Open Patent Application No. 2000-163294, accesses and updates to a database in secondary storage are performed in a data buffer in main memory, and the process of reflecting updated pages to the database runs asynchronously with the application program, so that deferred update processing is accomplished with only one set of data buffers. This reduces the main memory capacity required.
However, because only data of simple structure can be held in this data buffer, there is the same problem as in Japanese Laid-Open Patent Application No. 10-143412: a high-performance index file cannot be used.
When multiple users use a database system at the same time, search operations and alteration operations are requested asynchronously. In that situation, transaction processing is used to keep the data consistent. Transaction processing is explained in detail in (1) "Principles of Transaction Processing", by Philip A. Bernstein and Eric Newcomer; Nikkei Business Publications, Inc.
Completely isolated processing guarantees data consistency at every moment. However, the loss of parallel execution lowers overall throughput. To address this problem, the concept of isolation levels is used. Isolation levels are explained in detail in (2) "A Critique of ANSI SQL Isolation Levels", by Hal Berenson, Philip A. Bernstein, Jim Gray, Jim Melton, Elizabeth J. O'Neil, and Patrick E. O'Neil; Proc. ACM SIGMOD Conf. (June 1995), pp. 1-10.
To address the problems of the conventional techniques described above, an approach has been proposed in which data storage components for full-text search that are divided into a plurality of parts (inverted files) are brought together by a merge operation.
The merge operation is started when the amount of data in an inverted file to be merged reaches a threshold. There are two types of merge operation: synchronous merge, which performs the merge within the same sequence of operations as the insert operation on the inverted file (foreground processing), and asynchronous merge, which performs the merge in an operation separate from the insert operation (background processing).
In asynchronous merge, the inverted file to be merged must be handled exclusively so that inserts are performed correctly during the merge. During this exclusive handling, the merge operation and the insert operation delay each other, and as a result the response of the insert operation becomes slow.
Furthermore, with the development of information and communication technology in recent years, electronic documents and information related to them are being published in large quantities over the Internet and similar channels. Document search devices for finding desired documents quickly and accurately have therefore been proposed.
Such document search devices adopt keyword search methods and full-text search methods. A full-text search device compares an arbitrary search string with all the documents to be searched and extracts every document that contains the search string. Unlike the keyword search method, it therefore does not require a large amount of manual work to assign keywords in advance to all the documents to be searched. Various kinds of full-text search devices have been proposed, one of which uses the inverted (index) file method. In the inverted file method, an index file that records, for each character, word, or N-gram (sequence of n characters), the documents containing it or the positions at which it occurs in those documents is built in advance as an auxiliary file for searching; a full-text search is then carried out using only the inverted file. The inverted file method can thus achieve quite fast searching, which suits systems that need to search large volumes of documents at high speed.
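To make the inverted file method concrete, the following is a minimal sketch in Python. It assumes character bigrams (n = 2), in-memory dictionaries, and queries at least n characters long; the function names `ngrams`, `build_inverted_file`, and `search` are illustrative and are not taken from any of the devices discussed here.

```python
from collections import defaultdict


def ngrams(text, n=2):
    """Yield (position, n-gram) pairs for every n-character window of text."""
    for pos in range(len(text) - n + 1):
        yield pos, text[pos:pos + n]


def build_inverted_file(documents, n=2):
    """Map each n-gram to {doc_id: [positions where it occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for pos, gram in ngrams(text, n):
            index[gram][doc_id].append(pos)
    return index


def search(index, query, n=2):
    """Return sorted IDs of documents containing query, using only the index."""
    grams = list(ngrams(query, n))
    if not grams:
        raise ValueError("query must be at least n characters long")
    _, first_gram = grams[0]
    hits = set()
    for doc_id, starts in index.get(first_gram, {}).items():
        for start in starts:
            # Every later n-gram must occur at the matching offset.
            if all(start + off in index.get(gram, {}).get(doc_id, ())
                   for off, gram in grams[1:]):
                hits.add(doc_id)
                break
    return sorted(hits)


docs = {1: "full text search", 2: "keyword search", 3: "text index"}
index = build_inverted_file(docs)
print(search(index, "text"))   # prints [1, 3]
```

The documents themselves are never scanned at query time, which is where the speed of the method comes from; the price is an index that must be rebuilt or extended whenever documents are registered or deleted.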
General full-text search methods and the inverted file method are described in detail in "Information Retrieval Algorithm" (by Kenji Kitasato, Kazuhiko Tsuda, and Masami Shishibori; Kyoritsu Shuppan Co., Ltd.; pp. 160-179), in the "Description of the Related Art" of Japanese Laid-Open Patent Application No. 11-073429, and in the 1998 annual full-text search system meeting and activity report (http://www.ftsanet.com/dbtokyo99/Db99.htm). Since they are well known, they are not explained further here.
As conventional techniques using the inverted file method, Japanese Patent No. 3024544 (Japanese Laid-Open Patent Application No. 9-265420) describes an information retrieval device that stores real-time processing data in addition to the search index file, so that searches can still be carried out while the search index file is being updated. In addition, Japanese Laid-Open Patent Application No. 7-146880 describes a document search device and method that registers a new document in a sub-index smaller than the main index, thereby shortening the registration time.
However, the inverted file method, including the methods described in the above Japanese laid-open patent applications, usually requires building an inverted file several times larger than the original data. Consequently, as the amount of registered document data grows, a full-text index based on the inverted file method needs more time for registration and deletion processing. From the user's point of view, the response time of registration and deletion processing in such a full-text search device becomes long.
Summary of the invention
A general object of the present invention is to provide an improved and useful database management system, full-text search device, and full-text search method that overcome the problems described above.
A more specific object of the present invention is to provide a database management system that avoids degrading the response of insert operations while an asynchronous merge is in progress and improves the overall response to a large number of search requests, and to provide a full-text search device and full-text search method that shorten the response time of registration and deletion processing as seen by the user and eliminate periods during which processing cannot be executed.
To achieve the above objects, according to one aspect of the present invention, there is provided a database management system for managing a database, the system comprising: a first data storage component that executes data search operations at high speed and data alteration operations at low speed; second data storage components, each of which executes data search operations at low speed and the insert and delete operations among data alteration operations at high speed; a data transfer component that transfers data from each second data storage component to the first data storage component so as to reflect the results of insert or delete operations; a database operation request processing component that executes operation requests on the database; a transaction processing component that ensures data consistency between the data transfer component and the database operation request processing component; and a file switching component that switches the data storage components between operation requests on the database and asynchronous merge processing, so that while one second data storage component is used for asynchronous merge processing, another second data storage component is used for operation requests on the database.
According to the present invention, no exclusive handling is needed between the merge process and insert operations, so degradation of the insert response is avoided and the overall response time for search requests is reduced.
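The switching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `SwitchingStore`, its two in-memory dictionaries standing in for second data stores, and the split into `begin_merge` and `finish_merge` are all hypothetical, and the sketch is single-threaded, so it leaves out the consistency control that the transaction processing component would provide.

```python
class SwitchingStore:
    """One large first store fronted by two small, switchable second stores."""

    def __init__(self):
        self.main = {}          # first store: search-optimized
        self.deltas = [{}, {}]  # second stores: insert-optimized
        self.active = 0         # index of the delta that receives requests

    def insert(self, tuple_id, value):
        # Requests always go to the active delta, even while a merge runs.
        self.deltas[self.active][tuple_id] = value

    def begin_merge(self):
        """Switch roles and return the delta now reserved for merging."""
        frozen = self.deltas[self.active]
        self.active = 1 - self.active
        return frozen

    def finish_merge(self, frozen):
        """Reflect the frozen delta into the first store, then empty it."""
        self.main.update(frozen)
        frozen.clear()

    def search(self, predicate):
        # A search must see the first store and both deltas.
        visible = {**self.main,
                   **self.deltas[1 - self.active],
                   **self.deltas[self.active]}
        return sorted(tid for tid, value in visible.items() if predicate(value))


store = SwitchingStore()
store.insert(1, "a")
frozen = store.begin_merge()   # delta 0 is now reserved for the merge
store.insert(2, "b")           # lands in delta 1 without waiting
store.finish_merge(frozen)
```

Because inserts never touch the delta that is being merged, the merge does not have to lock them out, which is the effect the paragraph above describes.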
Other objects, features, and advantages of the present invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.
Brief description of the drawings
Fig. 1 is a block diagram showing the structure of a database management system.
Fig. 2 is a block diagram showing an example implementation of the database management system.
Fig. 3 is a flowchart showing an example of the processing procedure of the database operation request processing component.
Fig. 4 is a flowchart showing an example of the processing procedure of the data transfer component.
Fig. 5 is a block diagram showing the structure of a database management system according to a first embodiment of the present invention.
Fig. 6 is a block diagram illustrating the operation of the file switching component according to the first embodiment of the present invention.
Fig. 7 is a block diagram showing the structure of a database management system according to a second embodiment of the present invention.
Fig. 8 is a block diagram illustrating the operation of the file switching component according to the second embodiment of the present invention.
Fig. 9 is a block diagram illustrating the operation of the file switching component according to a third embodiment of the present invention.
Fig. 10 is a block diagram showing the structure of a database management system according to a fourth embodiment of the present invention.
Fig. 11 is a flowchart showing the processing procedure for a search request in the database operation request processing component according to the fourth embodiment of the present invention.
Fig. 12 is a flowchart illustrating the processing procedure of the watchdog timer according to the fourth embodiment of the present invention.
Fig. 13 is a flowchart showing the processing procedure of merge processing in the data transfer component according to the fourth embodiment of the present invention.
Fig. 14 is a block diagram illustrating a full-text search device according to a sixth embodiment of the present invention.
Fig. 15 is a block diagram showing the hardware configuration of a stand-alone example of the full-text search device shown in Fig. 14.
Fig. 16 is a block diagram showing the hardware configuration of a server/client example of the full-text search device shown in Fig. 14.
Fig. 17 is a diagram illustrating the processing of the full-text search device shown in Fig. 14 and showing an example of a full-text index.
Fig. 18 is a diagram illustrating merge processing, taking as an example the inverted list labeled "full text" in the full-text index shown in Fig. 17.
Fig. 19 is a block diagram illustrating a full-text search device according to an eighth embodiment of the present invention.
Fig. 20 is a flowchart illustrating a first processing example in the full-text search device shown in Fig. 19.
Fig. 21 is a flowchart illustrating a second processing example in the full-text search device shown in Fig. 19.
Fig. 22 is a flowchart illustrating a third processing example in the full-text search device shown in Fig. 19.
Fig. 23 is a block diagram illustrating a full-text search device according to a ninth embodiment of the present invention.
Fig. 24 is a flowchart illustrating a first processing example of the full-text search device shown in Fig. 23.
Fig. 25 is a flowchart illustrating a second processing example of the full-text search device shown in Fig. 23.
Fig. 26 is a flowchart illustrating a third processing example of the full-text search device shown in Fig. 23.
Fig. 27 is a block diagram illustrating a full-text search device according to a tenth embodiment of the present invention.
Fig. 28 is a flowchart illustrating a first processing example of the full-text search device shown in Fig. 27.
Fig. 29 is a flowchart illustrating a second processing example of the full-text search device shown in Fig. 27.
Fig. 30 is a flowchart illustrating a third processing example of the full-text search device shown in Fig. 27.
Fig. 31 is a block diagram illustrating a full-text search device according to an eleventh embodiment of the present invention.
Fig. 32 is a flowchart illustrating a first processing example of the full-text search device shown in Fig. 31.
Fig. 33 is a flowchart illustrating a second processing example of the full-text search device shown in Fig. 31.
Fig. 34 is a flowchart illustrating a fourth processing example of the full-text search device shown in Fig. 31.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
<System architecture>
Fig. 1 is a block diagram showing the structure of a database management system. The following description takes a relational database as the database, but the present invention is also applicable to other kinds of databases.
The database management system shown in Fig. 1 comprises a database operation request input component 1, a transaction processing component 2, a database operation request processing component 3, a first data storage component 4, second data storage components 5, and a data transfer component 6.
Operation requests on the database are entered through the database operation request input component 1 and processed by the database operation request processing component 3.
The first data storage component 4 is used for search (retrieval) operations on the database and for alteration operations during data transfer.
The second data storage components 5 are used for insert, delete, and update operations on the database and for delete operations during data transfer.
The objects handled in the first data storage component 4 and the second data storage components 5 are index files; in a relational database, for example, these index files are stored in association with each other for the purpose of referencing the data.
The data transfer component 6 reads tuples from the second data storage components 5 and, on the first data storage component 4, inserts tuples, updates tuples, and deletes tuples.
The transaction processing component 2 performs exclusive control, which governs the execution order of asynchronously requested search and alteration operations so as to keep the data consistent, and records log information used to restore the state that existed before a transaction started if an alteration operation is cancelled.
The above components are described in more detail below.
"Database operation request input component 1"
Fig. 2 is a block diagram showing an example implementation of the database management system. The function of the database operation request input component 1 is realized by input terminals 25. A server host 20 holds the database and has a CPU 21, a memory 22 comprising a program area 22a and a data area 22b, and a hard disk 23, which are connected to one another by a data bus 24. A plurality of input terminals 25 are connected to the server host 20 via a LAN 26. The database operation request input component 1 in this example allows a plurality of users to enter database operation requests.
A user enters a character string representing a database operation request, for example in the form of an SQL statement, at an input terminal 25. The request is sent over the LAN 26 to the server host 20 and processed there. The result is sent back over the LAN 26 to the input terminal 25 and presented to the user on the terminal's display.
"Database operation request processing component 3"
Fig. 3 is a flowchart showing an example of the processing procedure of the database operation request processing component 3.
The processing is divided according to the type of database operation request. In this embodiment, every tuple in a table is given a unique ID (tuple ID), so that each tuple can be identified by its tuple ID.
When the database operation request processing component 3 receives a database operation request (step S1), it determines whether the request is an insert operation (step S2). If it is, the component obtains a tuple ID for the new tuple (step S3), inserts the tuple into the second data storage component 5 for insertion (step S4), and returns the result (step S5).
If the request in step S2 is not an insert operation, the database operation request processing component 3 executes a search operation on each of the first data storage component 4 used for search (retrieval), the second data storage component 5 for insertion, and the second data storage component 5 for deletion (steps S6-S8), and then forms the final search result set R from the individual result sets Rr, Ri, and Rd (step S9): R = Rr + Ri - Rd, where + denotes logical OR and - denotes logical NOT.
If the request is a search operation (step S10), the database operation request processing component 3 returns the result (step S5). If it is not a search operation, the component determines whether the request is a delete operation (step S11). If it is not a delete operation, the component performs the update operation on the second data storage component 5 for insertion (step S15), using the tuples selected by the same procedure as in a search operation together with tuples carrying the new values, and returns the result (step S5).
If the request is a delete operation in step S11 and a tuple selected by the same procedure as in a search operation is in Ri (Yes in step S12), the component deletes it from the second data storage component 5 for insertion (step S13) and returns the result (step S5). If the tuple is not in Ri, that is, it is in Rr (No in step S12), the component inserts the tuple into the second data storage component 5 for deletion (step S14), returns the result (step S5), and ends the procedure.
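The insert, search, and delete branches of Fig. 3 can be sketched as follows. This is a simplified model under stated assumptions: the class name `DeferredUpdateTable` is hypothetical, each store is an in-memory dictionary from tuple ID to value, search conditions are Python predicates rather than SQL, and the update branch (step S15) is left out so that a tuple lives in exactly one of the first store and the insertion store.

```python
class DeferredUpdateTable:
    """Hypothetical model of stores 4 and 5; each dict maps tuple ID -> value."""

    def __init__(self, first=None):
        self.first = dict(first or {})   # first store (4): search-optimized
        self.ins = {}                    # second store (5) for insertion
        self.dele = {}                   # second store (5) for deletion
        self._next_id = max(self.first, default=0) + 1

    def insert(self, value):
        tuple_id = self._next_id         # step S3: obtain a new tuple ID
        self._next_id += 1
        self.ins[tuple_id] = value       # step S4: insert into the insertion store
        return tuple_id                  # step S5: return the result

    def search(self, predicate):
        rr = {t for t, v in self.first.items() if predicate(v)}   # step S6
        ri = {t for t, v in self.ins.items() if predicate(v)}     # step S7
        rd = {t for t, v in self.dele.items() if predicate(v)}    # step S8
        return (rr | ri) - rd            # step S9: R = Rr + Ri - Rd

    def delete(self, predicate):
        selected = self.search(predicate)
        for tuple_id in selected:
            if tuple_id in self.ins:                          # step S12: in Ri
                del self.ins[tuple_id]                        # step S13
            else:                                             # otherwise in Rr
                self.dele[tuple_id] = self.first[tuple_id]    # step S14
        return selected


table = DeferredUpdateTable({100: "old doc"})
table.insert("new doc")
table.delete(lambda v: v == "old doc")   # recorded in the deletion store only
print(table.search(lambda v: "doc" in v))   # prints {101}
```

Note that the delete never touches the first store: the tuple stays there until the data transfer runs, and the subtraction of Rd in step S9 is what hides it from searches in the meantime.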
"First data storage component 4"
The first data storage component 4 is the data storage component used for searching; it is used for the search operations executed by the database operation request processing component 3 and for the alteration operations executed by the data transfer component 6.
The first data storage component 4 can execute search operations at high speed but is relatively slow at alteration operations. For example, a full-text search server capable of full-text search can serve as the first data storage component.
"Second data storage components 5"
There are two kinds of second data storage component 5, one for insertion and one for deletion.
The second data storage component 5 for insertion is used for the insert, delete, and update operations executed by the database operation request processing component 3, and also for the delete operations executed by the data transfer component 6.
The second data storage component 5 for deletion is used for the delete operations executed by the database operation request processing component 3, and also for the delete operations executed by the data transfer component 6.
The second data storage components 5 can execute insert and alteration operations at high speed. For example, an ordinary file managed by the OS can be used as a second data storage component 5.
When a second data storage component 5, such as an ordinary file, cannot execute a search operation itself, it returns the tuples it holds one after another and the database operation request processing component 3 evaluates the search condition (this is called an exhaustive search).
Because the second data storage components hold very few tuples compared with the number held by the first data storage component 4, the exhaustive search does not affect the response time.
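As an illustration of a second data store kept in an ordinary file, the sketch below appends tuples to a file and answers searches by scanning it. The JSON-lines layout, the class name `FileTupleStore`, and the function `exhaustive_search` are assumptions made for this example only.

```python
import json


class FileTupleStore:
    """Append-only tuple store in an ordinary file, one JSON record per line."""

    def __init__(self, path):
        self.path = path
        open(path, "a", encoding="utf-8").close()   # create the file if missing

    def insert(self, tuple_id, values):
        # Appending is cheap: no index has to be maintained.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps([tuple_id, values]) + "\n")

    def scan(self):
        """Return the stored tuples one after another; the store cannot search."""
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                tuple_id, values = json.loads(line)
                yield tuple_id, values


def exhaustive_search(store, predicate):
    """The request processor, not the store, evaluates the search condition."""
    return [tuple_id for tuple_id, values in store.scan() if predicate(values)]
```

The cost of `exhaustive_search` grows linearly with the number of stored tuples, so the approach relies on the data transfer keeping the file small.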
"Data transfer component 6"
Fig. 4 is a flowchart showing an example of the processing procedure of the data transfer component 6.
In the data transfer executed by the data transfer component 6, tuples are read one by one from the second data storage component 5 for insertion, and each is inserted into the first data storage component 4 or used to update it, according to its value. The tuples that have been reflected are then deleted from the second data storage component 5 for insertion.
Next, tuples are read one by one from the second data storage component 5 for deletion and deleted from the first data storage component 4; the tuples that have been reflected are then deleted from the second data storage component 5 for deletion.
In Fig. 4, when the data transfer component 6 obtains a tuple from the second data storage component 5 for insertion (step S21), it searches the first data storage component 4 for a tuple with the same tuple ID (step S22). If such a tuple exists (Yes in step S23), the data transfer component 6 updates the tuple in the first data storage component 4 (step S24) and deletes the tuple from the second data storage component 5 for insertion (step S25).
If no such tuple exists (No in step S23), the data transfer component 6 inserts the tuple into the first data storage component 4 (step S29) and deletes it from the second data storage component 5 for insertion (step S25).
After step S25, the data transfer component 6 obtains each tuple from the second data storage component 5 for deletion (step S26), deletes that tuple from the first data storage component 4 (step S27), deletes it from the second data storage component 5 for deletion (step S28), and ends the procedure.
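The transfer procedure of Fig. 4 can be sketched as a single function. The sketch assumes that all three stores are in-memory dictionaries from tuple ID to value; the function name `transfer` and the returned counters are additions for illustration and are not part of the described system.

```python
def transfer(first, ins, dele):
    """Reflect both second stores into the first store, then empty them.

    All three arguments are dicts mapping tuple ID -> value (an assumption of
    this sketch). Returns counters describing what was done.
    """
    stats = {"updated": 0, "inserted": 0, "deleted": 0}

    for tuple_id in list(ins):               # step S21: take each tuple
        if tuple_id in first:                # steps S22-S23: same tuple ID present?
            stats["updated"] += 1            # step S24: update in the first store
        else:
            stats["inserted"] += 1           # step S29: insert into the first store
        first[tuple_id] = ins[tuple_id]
        del ins[tuple_id]                    # step S25: drop from the insertion store

    for tuple_id in list(dele):              # step S26: take each tuple
        if tuple_id in first:
            del first[tuple_id]              # step S27: delete from the first store
            stats["deleted"] += 1
        del dele[tuple_id]                   # step S28: drop from the deletion store

    return stats


first = {1: "a", 2: "b"}
print(transfer(first, {2: "B", 3: "c"}, {1: "a"}))
# prints {'updated': 1, 'inserted': 1, 'deleted': 1}
```

Iterating over `list(ins)` and `list(dele)` takes a snapshot of the keys, so entries can be removed from the dictionaries while the loop runs.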
"Transaction processing component 2"
The transaction processing component 2 performs exclusive control, which governs the execution order of asynchronously requested search and alteration operations so as to keep the data consistent, and records log information used to restore the state that existed before a transaction started if an alteration operation is cancelled.
Exclusive control is applied not only among database operations requested by multiple users but also between the database operation request processing component 3 and the data transfer component 6.
Different isolation levels can be realized using the multi-granularity locking explained in document (1) mentioned above. The objects to be locked are tables, tuples, and files; a file consists of the first data storage component 4 and the second data storage components 5 for insertion and deletion.
For exclusive control and logging, the first data storage component 4 and the second data storage components 5 that make up a file can be handled together as a single object, regardless of their internal structure.
A method of realizing the READ COMMITTED isolation level is described below. First, the following preconditions on locks apply.
- A table lock is in one of the following modes.
(Table 1)
Lock mode | Description
S (shared) | Used for read-only processing, such as SELECT.
X (exclusive) | Used for data-changing processing, such as UPDATE, INSERT, or DELETE, to prevent the same resource from being updated at the same time.
IS (intent shared) | Indicates that the transaction intends to read lower-level resources by placing S locks on them.
IX (intent exclusive) | Indicates that the transaction intends to place X locks on lower-level resources.
SIX (shared with intent exclusive) | Indicates that the transaction intends to change lower-level resources under IX locks, while allowing the resource to be read concurrently.
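Table 1 lists the modes but not which of them can be held at the same time. The sketch below encodes the compatibility matrix commonly given for multi-granularity locking in the database literature; it is not stated in this document, so treat it as an assumption, along with the names `COMPATIBLE` and `can_grant`.

```python
# For each held mode, the set of modes another transaction may still acquire
# on the same table. This is the textbook multi-granularity matrix, assumed here.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


print(can_grant("IS", ["IX", "S"]))   # prints True
print(can_grant("IX", ["S"]))         # prints False
```

Under this matrix a searching transaction (IS on the table) and an altering transaction (IX on the table) can proceed together, and any conflict between them is settled at the tuple level.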
The lock of-tuple is S pattern or X pattern.
The S-lock of-tuple can be cancelled in any moment after quoting.
The X-lock of-tuple only just can be cancelled when affairs finish.
With afterwards, a special-purpose latch is applied to described file before the-access file.
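The text names the five lock modes but does not spell out which of them may be held together. As a minimal sketch, the following assumes the conventional compatibility matrix for multiple-granularity locking; the names `COMPATIBLE` and `can_grant` are hypothetical and not taken from the patent.

```python
# Conventional multiple-granularity compatibility matrix (an assumption;
# the source text only lists the modes). COMPATIBLE[a] is the set of modes
# that another transaction may hold while mode a is held.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)
```

Under this matrix a searcher holding IS on a table does not block a writer requesting IX on the same table, which is what lets the search and modification procedures run concurrently at table level.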
The procedure for each type of database operation is described below.
(1) Search operation
0. Before starting the operation, acquire an IS lock on the target table.
1. Latch the file.
2. Search for a tuple that matches the search condition.
3. Acquire an S lock on the tuple found.
4. If the S lock cannot be acquired, release the file latch and return to step 1.
5. If the S lock can be acquired, add the tuple to the search results.
6. Release the tuple lock.
7. Release the file latch.
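The search steps above can be sketched as follows. This is an illustration under simplifying assumptions rather than the patented implementation: the tuple S lock is modelled by a plain mutex, the table-level IS lock of step 0 is omitted, and the names `Tuple`, `File`, `search`, and `max_retries` are hypothetical.

```python
import threading


class Tuple:
    def __init__(self, tid, value):
        self.tid, self.value = tid, value
        self.lock = threading.Lock()   # stands in for the tuple S/X lock


class File:
    def __init__(self, tuples):
        self.latch = threading.Lock()  # exclusive file latch
        self.tuples = tuples


def search(file, predicate, max_retries=1000):
    """Collect matching tuple values, retrying when a tuple lock is unavailable."""
    results, seen = [], set()
    for _ in range(max_retries):
        blocked = False
        with file.latch:                                 # steps 1 and 7
            for t in file.tuples:                        # step 2
                if t.tid in seen or not predicate(t.value):
                    continue
                if not t.lock.acquire(blocking=False):   # step 3 fails: step 4
                    blocked = True
                    break
                try:
                    results.append(t.value)              # step 5
                    seen.add(t.tid)
                finally:
                    t.lock.release()                     # step 6
        if not blocked:
            return results
    raise TimeoutError("tuple lock could not be acquired")
```

Releasing the latch before retrying gives a writer that holds the conflicting tuple lock the chance to latch the file, finish, and release its lock at commit.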
(2) Insert operation
0. Before starting the operation, acquire an IX lock on the target table.
1. Latch the file.
2. Obtain a new tuple ID.
3. Acquire an X lock on the tuple.
4. Record the tuple insert operation in the log.
5. Insert the tuple into the file.
6. Release the file latch.
(3) Update operation
0. Before starting the operation, acquire an IX lock on the target table.
1. Latch the file.
2. Search for a tuple that matches the condition of the tuple to be updated.
3. Acquire an X lock on the tuple found.
4. If the X lock cannot be acquired, release the file latch and return to step 1.
5. If the X lock can be acquired, record the tuple update operation in the log.
6. Update the tuple.
7. Release the file latch.
(4) Delete operation
0. Before starting the operation, acquire an IX lock on the target table.
1. Latch the file.
2. Search for a tuple that matches the condition of the tuple to be deleted.
3. Acquire an X lock on the tuple found.
4. If the X lock cannot be acquired, release the file latch and return to step 1.
5. If the X lock can be acquired, record the tuple delete operation in the log.
6. Delete the tuple.
7. Release the file latch.
When a transaction ends, the following processing is performed.
(1) Commit
A commit is performed when an IX lock was acquired on the target table before a modification operation started:
1. Record the commit operation in the log.
2. Release all locks requested by the committed transaction.
(2) Rollback (abort)
A rollback is performed when an IX lock was acquired on the target table before a modification operation started:
1. Record the rollback operation in the log.
2. Undo the data operations according to the log. During this process, latch the files as needed.
3. Release all locks requested by the rolled-back transaction.
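The commit and rollback steps can be illustrated with a small undo log. This is a sketch under stated assumptions: the data store is a plain dictionary from tuple ID to value, locks are tracked only as a set of tuple IDs, and the class name `Transaction` and its methods are hypothetical.

```python
class Transaction:
    def __init__(self, store):
        self.store = store   # dict: tuple ID -> value (assumed representation)
        self.log = []        # records of the form (operation, tuple ID, old value)
        self.locks = set()   # tuple IDs on which this transaction holds X locks

    def insert(self, tid, value):
        self.locks.add(tid)
        self.log.append(("insert", tid, None))       # log before changing data
        self.store[tid] = value

    def update(self, tid, value):
        self.locks.add(tid)
        self.log.append(("update", tid, self.store[tid]))
        self.store[tid] = value

    def delete(self, tid):
        self.locks.add(tid)
        self.log.append(("delete", tid, self.store[tid]))
        del self.store[tid]

    def commit(self):
        self.log.append(("commit", None, None))      # 1. record the commit
        self.locks.clear()                           # 2. release all locks

    def rollback(self):
        undo = list(reversed(self.log))
        self.log.append(("rollback", None, None))    # 1. record the rollback
        for op, tid, old in undo:                    # 2. undo in reverse order
            if op == "insert":
                self.store.pop(tid, None)
            elif op in ("update", "delete"):
                self.store[tid] = old
        self.locks.clear()                           # 3. release all locks
```

Because X locks on tuples are held until the transaction ends, no other transaction can observe the intermediate values that the rollback removes, which is what READ COMMITTED requires.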
The basic operation of merging is described below. In merging, a daemon running in the background merges the small inverted files into the large inverted file, using the following three files.
- Large inverted file: the merge target file
- Small inverted file for insertion: the file to be merged that holds insertions
- Small inverted file for deletion: the file to be merged that holds deletions
The operation of the daemon is outlined below.
The daemon has three threads: a main thread that runs in the same thread as the executing program, a daemon thread that performs the merge, and a timer thread that manages periodic merging.
In its basic operation, the main thread judges whether the size of a small inverted file has exceeded a predetermined value as a result of insertions into that file. When the file size exceeds the predetermined value, the main thread sends the file IDs of the target large inverted file and the two small inverted files to the daemon thread and instructs it to execute a merge, so that merging is triggered by size.
In its basic operation, the timer thread wakes at a constant interval and judges whether the specified time has elapsed. When it has, the timer thread sends to the daemon thread, for every large inverted file, the file IDs of the target large inverted file and its two small inverted files and instructs it to execute a merge, so that merging is triggered at fixed intervals.
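The two merge triggers described above, one based on size and one based on elapsed time, can be sketched as follows. The function and class names, the parameters `size_limit` and `interval_s`, and the injectable clock are assumptions made for illustration.

```python
import time


def should_merge_by_size(small_file_size, size_limit):
    """Main-thread check: merge once the small inverted file exceeds the limit."""
    return small_file_size > size_limit


class MergeTimer:
    """Timer-thread check: reports that a merge is due once per interval."""

    def __init__(self, interval_s, now=time.monotonic):
        self.interval_s = interval_s
        self.now = now            # injectable clock, useful for testing
        self.last = now()

    def due(self):
        current = self.now()
        if current - self.last >= self.interval_s:
            self.last = current   # restart the interval
            return True
        return False
```

In a real daemon each positive check would be followed by sending the file IDs of the large inverted file and the two small inverted files to the merging thread, for example through a queue.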
On receiving an instruction to execute a merge, the daemon thread merges the inverted files by performing the following series of operations.
1. Start a default transaction.
2. Lock the meta-database as a whole (operation target: tuple, operation content: read).
3. Lock the database table of the meta-database (operation target: tuple, operation content: read).
4. Lock the tuple that corresponds to the target database in the database table of the meta-database (operation target: tuple, operation content: read).
5. Lock the database (operation target: tuple, operation content: read and write).
6. Lock the table (operation target: tuple, operation content: read and write).
7. Change the inverted files to the merging state.
8. Merge the inverted lists of the small inverted files.
9. Empty the merged small inverted files.
10. Cancel the merging state of the merged inverted files.
11. End the processing.
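Steps 8 and 9 of the sequence above can be sketched as follows. The in-memory representation is an assumption made for illustration: each inverted file is a dictionary that maps a token to a list of (document ID, position) postings, and `merge_inverted_files` is a hypothetical name.

```python
def merge_inverted_files(large, small_insert, small_delete):
    """Apply pending insertions and deletions to the large inverted file."""
    # Step 8a: add the postings recorded for insertion.
    for token, postings in small_insert.items():
        merged = set(large.get(token, [])) | set(postings)
        large[token] = sorted(merged)
    # Step 8b: remove the postings recorded for deletion.
    for token, postings in small_delete.items():
        doomed = set(postings)
        remaining = [p for p in large.get(token, []) if p not in doomed]
        if remaining:
            large[token] = remaining
        else:
            large.pop(token, None)
    # Step 9: empty the merged small inverted files.
    small_insert.clear()
    small_delete.clear()
```

The locking in steps 1 to 7 and the state change in step 10 are left out; they would wrap this call.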
Next, a database management system according to a first embodiment of the present invention, which is based on the database management system described above, is described.
<Embodiment 1>
Fig. 5 is a block diagram showing the structure of the database management system according to the first embodiment of the present invention.
In the first embodiment, components identical or equivalent to those in Fig. 1 are denoted by the same reference numerals, and only the differences are described in detail here.
As shown in Fig. 5, the database management system comprises a database operation request input component 1, a transaction processing component 2, a database operation request processing component 3, a first data storage component 4, second data storage components 5, a data transfer component 6, and a file conversion component 7.
First, there are multiple second data storage components 5 (inverted files for insertion and for deletion).
The file conversion component 7 determines whether each second data storage component 5 is used for database operations or for merge processing. When a second data storage component 5 is used for database operations, the file conversion component 7 passes database operation requests to that second data storage component 5. When a second data storage component 5 is used for merging, the file conversion component 7 passes the merge processing request to the first data storage component 4 during merge processing.
In Fig. 5, the first data storage component 4 corresponds to the data storage component for searching, and the second data storage components 5 correspond to the data storage components for insertion and deletion.
Fig. 6 is a diagram explaining the operation of the file conversion component 7.
First, there are two sets of second data storage components 5 (each set consisting of one for insertion and one for deletion).
The file conversion component 7 switches the two sets so that, while one set of second data storage components 5 is used for database operations, the other set is used for merging.
The file conversion component 7 switches the roles of the two sets of second data storage components 5 at the step "7. Change the inverted files to the merging state" in the operation of the merge-processing daemon thread described above. Specifically, the file conversion component 7 changes the first data storage component (large inverted file) 4 to the merging state, converts the second data storage components 5 used for database operations into second data storage components for merge processing, converts the second data storage components 5 used for merge processing into second data storage components for database operations, and then changes only the second data storage components (small inverted files) 5 for merge processing to the merging state.
The steps "8. Merge the inverted lists of the small inverted files" and "9. Empty the merged small inverted files" described above are then performed on the second data storage components 5 used for merge processing.
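The role switch performed by the file conversion component 7 resembles double buffering. The sketch below assumes each set is a pair of dictionaries (one for insertion, one for deletion); the class name `FileConverter` and its attributes are hypothetical.

```python
class FileConverter:
    """Keeps two sets of small inverted files and swaps their roles."""

    def __init__(self):
        self.active = {"insert": {}, "delete": {}}    # receives database operations
        self.merging = {"insert": {}, "delete": {}}   # being merged into the large file

    def switch(self):
        """Swap roles at merge step 7 and return the set that must now be merged."""
        if self.merging["insert"] or self.merging["delete"]:
            raise RuntimeError("previous merge has not emptied its files yet")
        self.active, self.merging = self.merging, self.active
        return self.merging
```

After `switch()` new insertions and deletions land in an empty set, so they do not have to wait for the merge of the set that was just retired.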
<Embodiment 2>
In the first embodiment described above, when a large number of insert operations occur at once in response to database operation requests, insertion continues even after the total amount of data exceeds the capacity of the second data storage component 5 used for database operations, which risks degrading performance.
In the second embodiment of the present invention, three or more second data storage components are provided, so that when the amount of data in the second data storage component 5 used for database operations exceeds a given threshold, another, unused second data storage component 5 is switched in as the second data storage component 5 for database operations.
Fig. 7 is a block diagram showing the structure of the database management system according to the second embodiment of the present invention.
In the second embodiment, components identical or equivalent to those in Fig. 1 and in the first embodiment are denoted by the same reference numerals, and only the differences are described in detail here.
As shown in Fig. 7, the database management system comprises a database operation request input component 1, a transaction processing component 2, a database operation request processing component 3, a first data storage component 4, second data storage components (for insertion and for deletion) 5, a data transfer component 6, a file conversion component 7, and a data volume judgment component 8.
Fig. 8 is a diagram explaining the operation of the file conversion component 7 according to the second embodiment of the present invention.
First, there are three or more second data storage components (for insertion and for deletion) 5. In addition, a reference queue 60 is prepared with entries corresponding to the second data storage components 5, so that each second data storage component 5 and its entry in the reference queue 60 refer to each other. The second data storage component 5 used for database operations is referenced from the head position of the reference queue 60, and the remaining second data storage components serve either as second data storage components 5 undergoing merge processing or as empty second data storage components 5.
The data volume judgment component 8 monitors the amount of data in the second data storage component 5 used for database operations. When the amount equals or exceeds a predetermined threshold, the data volume judgment component 8 notifies the file conversion component 7 and requests an empty second data storage component 5 for database operations.
When the data volume judgment component 8 requests an empty second data storage component 5, the file conversion component 7 moves an empty second data storage component 5, chosen from among those assigned to merge processing, to the head position of the reference queue 60, and moves the second data storage component 5 for database operations that was previously at the head to the second or a later position of the reference queue 60. In this way, the second data storage components 5 for database operations and for merge processing are switched.
The file conversion component 7 switches the roles of the second data storage components 5 at the step "7. Change the inverted files to the merging state" in the operation of the merge-processing daemon thread described above. In particular, when the amount of data in the second data storage component 5 exceeds the given threshold, the file conversion component 7 rearranges the reference queue 60, converts the second data storage component 5 used for database operations into a second data storage component for merge processing, converts a second data storage component 5 used for merge processing into the second data storage component for database operations, and then changes only the second data storage components 5 for merge processing to the merging state.
The steps "8. Merge the inverted lists of the small inverted files" and "9. Empty the merged small inverted files" described above are performed on the second data storage components 5 used for merge processing.
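The reference queue of the second embodiment can be sketched with a double-ended queue whose head is the component that currently receives database operations. The class name `FixedPool`, the dictionary representation of a component, and the entry-count threshold `limit` are assumptions made for illustration.

```python
from collections import deque


class FixedPool:
    """A fixed number of small files; the head of the queue takes new data."""

    def __init__(self, n, limit):
        if n < 3:
            raise ValueError("the second embodiment uses three or more components")
        self.queue = deque({} for _ in range(n))   # stands in for reference queue 60
        self.limit = limit                         # threshold, counted in entries

    def insert(self, key, value):
        if len(self.queue[0]) >= self.limit:       # data volume judgment
            self._switch()
        self.queue[0][key] = value

    def _switch(self):
        """Move an empty component to the head; the full one drops to position 2."""
        for i in range(1, len(self.queue)):
            if not self.queue[i]:
                empty = self.queue[i]
                del self.queue[i]
                self.queue.appendleft(empty)
                return
        raise RuntimeError("no empty second data storage component available")
```

The `RuntimeError` branch corresponds to the case in which every component is still waiting to be merged, which is the situation the fixed pool size is meant to make rare.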
<Embodiment 3>
In the second embodiment described above, the maximum number of second data storage components 5 that will be needed must be determined in advance. Consequently, except at moments when there are many database operation requests, some of the second data storage components 5 sit unused, which wastes resources.
In the third embodiment of the present invention, second data storage components are provided dynamically as needed.
The structure of the database management system according to the third embodiment of the present invention is the same as that of the second embodiment described above, and only the differences are described in detail here.
Fig. 9 is a diagram explaining the operation of the file conversion component 7 according to the third embodiment of the present invention.
The database management system according to the third embodiment of the present invention includes a reference queue 70 for registering the second data storage components 5 currently in use, and a reference queue count 80 that holds the number of second data storage components 5 registered in the reference queue 70.
The second data storage component 5 used for database operations is located at the head position of the reference queue 70, and the remaining second data storage components serve as second data storage components 5 undergoing merge processing.
When the data volume judgment component 8 requests an empty second data storage component 5, the file conversion component 7 dynamically creates a new second data storage component 5, places it at the head position of the reference queue 70, moves the second data storage component 5 for database operations that was previously at the head to the second or a later position of the reference queue 70, and increments the reference queue count 80 by one. In this way, the second data storage components 5 for database operations and for merge processing are switched.
The file conversion component 7 switches the roles of the second data storage components 5 at the same timing as in the second embodiment described above. However, at the step "9. Empty the merged small inverted files" described above, the second data storage component 5 on which the merge was performed and its corresponding entry in the reference queue 70 are deleted, and the reference queue count 80 is decremented by one.
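The dynamic variant of the third embodiment can be sketched as follows, again as an illustration under assumptions: components are dictionaries, the large file is a dictionary that is simply updated on merge, and the names `DynamicPool`, `limit`, and `merge_oldest` are hypothetical.

```python
from collections import deque


class DynamicPool:
    """Small files are created on demand and discarded after being merged."""

    def __init__(self, limit):
        self.queue = deque([{}])   # stands in for reference queue 70
        self.count = 1             # stands in for reference queue count 80
        self.limit = limit

    def insert(self, key, value):
        if len(self.queue[0]) >= self.limit:
            self.queue.appendleft({})    # create a new component at the head
            self.count += 1
        self.queue[0][key] = value

    def merge_oldest(self, large):
        """Merge the oldest non-head component into `large`, then delete it."""
        if self.count < 2:
            return False                 # only the active component exists
        large.update(self.queue.pop())
        self.count -= 1
        return True
```

Compared with a fixed pool, no component is allocated until the active one fills up, and each merged component is released instead of being kept empty.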
<Embodiment 4>
In the database management system, merge processing is performed one index unit at a time, so that searches can be executed even while a merge is in progress.
A search uses the first data storage component 4 and the second data storage components 5 used for merge processing, so exclusive control must be applied before and after the merge processing of each index unit.
In addition, when search requests are processed, data that is still held in the second data storage components 5 for merge processing is in a state that is unfavorable for search execution. It is therefore better to search in the state reached after merge processing has been carried out.
However, when a large number of search requests arrive during merge processing, the searches hold the exclusive control, so the merge processing cannot proceed and the data to be merged remains in the state that is unfavorable for search. As a result, overall search execution slows down.
In the fourth embodiment of the present invention, exclusive control between search processing and merge processing is performed using the exclusion mechanism of the transaction processing component 2. Accordingly, even when a large number of search requests arrive, merge processing can be completed quickly, which improves the overall response to search requests.
Fig. 10 is a block diagram showing the structure of the database management system according to the fourth embodiment of the present invention. The database management system of the fourth embodiment is based on the database management system of the first embodiment described above. Alternatively, it may be based on the database management system of the second or third embodiment.
In the fourth embodiment, components identical or equivalent to those in Fig. 1 and in the first embodiment are denoted by the same reference numerals, and only the differences are described in detail here.
As shown in Fig. 10, the database management system comprises a database operation request input component 1, a transaction processing component 2, a database operation request processing component 3, a first data storage component 4, second data storage components 5, a data transfer component 6, a file conversion component 7, and a watchdog timer 9.
The operation of the fourth embodiment of the present invention is described below with reference to the flowcharts of Figs. 11 to 13.
(1) Search processing
Fig. 11 is a flowchart showing how a search request is processed in the database operation request processing component 3.
First, when search processing starts, the database operation request processing component 3 that received the search request sets in the watchdog timer 9 the maximum time by which the search may be delayed (step S31).
Next, the database operation request processing component 3 requests the exclusion mechanism of the transaction processing component 2 to acquire exclusive control for search processing (step S32).
Once exclusive control for search processing has been obtained, after the maximum delay time has been set in the watchdog timer 9, search processing can be carried out.
The database operation request processing component 3 then clears the watchdog timer 9 (step S33) and executes the search on the database (step S34).
When the search finishes, the database operation request processing component 3 requests the exclusion mechanism to release the exclusive control for search processing and ends the procedure (step S35).
Fig. 12 is a flowchart showing the processing of the watchdog timer 9.
The watchdog timer 9 waits until the database operation request processing component 3 sets a maximum delay time in the timer (step S41).
When a maximum delay time has been set in the timer (Yes in step S41), the watchdog timer 9 sleeps while counting down until the maximum delay time is reached (steps S42 and S43).
When the timer reaches the maximum delay time (Yes in step S43), the watchdog timer 9 requests the merge processing to release its exclusive control in favor of search processing, and then returns to the standby state until the next maximum delay time is set in the timer (steps S44 and S41).
(2) Merge processing
Fig. 13 is a flowchart showing the merge processing in the data transfer component 6.
In merge processing, the data transfer component 6 requests the exclusion mechanism of the transaction processing component 2 to acquire exclusive control for merge processing, so that search requests are temporarily blocked during merge processing (step S51).
After exclusive control for merge processing has been obtained, the data transfer component 6 checks whether the watchdog timer 9 has requested release of the lock (exclusive control) (step S52). When release has not been requested (No in step S52), the data transfer component 6 extracts the inverted list for one merge unit from the second data storage components 5 (for insertion and for deletion) that are being merged, and performs merge processing on the first data storage component 4 for that one index unit (step S53).
Steps S52 and S53 are repeated for as long as inverted lists remain to be merged (Yes in step S54).
When no inverted list (index unit) remains (No in step S54), the merge processing is complete. The data transfer component 6 requests the exclusion mechanism to release the exclusive control for merge processing and ends the procedure (step S55).
On the other hand, when release of the lock has been requested in step S52 (Yes in step S52), the data transfer component 6 requests release of the exclusive control for merge processing once the merge of the inverted list for the current index unit has completed (step S56).
The data transfer component 6 temporarily suspends merge processing and yields the CPU to search processing (step S57). It then returns to step S51 to request exclusive control for merge processing again.
Thus, searches are executed while the CPU is yielded, and no merge of inverted lists takes place during that time. Search processing can therefore be carried out safely.
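The interplay between the watchdog timer and the merge loop of Figs. 11 to 13 can be sketched without real threads. In this single-threaded illustration exclusive control is implicit, time comes from an injected `clock` function, and waiting searches are represented by a callback; the names `Watchdog`, `merge`, and `run_searches` are hypothetical.

```python
class Watchdog:
    """Tracks the deadline after which a waiting search must be served."""

    def __init__(self):
        self.deadline = None

    def set(self, now, max_delay):        # S31: a search starts waiting
        self.deadline = now + max_delay

    def clear(self):                      # S33: the search obtained control
        self.deadline = None

    def release_requested(self, now):     # S43/S44: deadline reached
        return self.deadline is not None and now >= self.deadline


def merge(units, large, watchdog, clock, run_searches):
    """Merge one index unit at a time, yielding when the watchdog fires.

    `run_searches` must serve the waiting searches and clear the watchdog,
    mirroring step S33; otherwise the merge would keep yielding forever.
    """
    i = 0
    while i < len(units):
        # S51: exclusive control for merging is (re)acquired here.
        while i < len(units) and not watchdog.release_requested(clock()):  # S52
            token, postings = units[i]
            large.setdefault(token, []).extend(postings)                   # S53
            i += 1
        if i < len(units):
            run_searches()   # S56 and S57: release control and yield to searches
```

Checking the watchdog only between index units means a search never observes a half-merged inverted list, and its waiting time is bounded by the maximum delay plus the time to merge one unit.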
<Embodiment 5>
Furthermore, the present invention is not limited to the above embodiments. The object of the present invention can also be achieved by implementing the functions of the components of the database management system of each of the foregoing embodiments as a program: the program is written in advance to a recording medium such as a CD-ROM, the CD-ROM is loaded into a computer that has a media drive such as a CD-ROM drive, the program is stored in the memory or storage device of the computer, and the program is executed.
In this case, the program read from the recording medium itself realizes the functions of the components of each of the foregoing embodiments. The program and the recording medium storing the program are therefore also included in the scope of the present invention.
The recording medium may be a semiconductor medium (for example, a ROM or a non-volatile memory card), an optical medium (for example, a DVD, MO, MD, or CD-R), or a magnetic medium (for example, magnetic tape or a floppy disk).
The functions of the components of each of the foregoing embodiments are realized not only by executing the loaded program. They may also be realized by an operating system or the like that performs part or all of the actual processing according to the instructions of the program.
In addition, the program described above may be stored in a storage device of a server computer, such as a magnetic disk, and distributed by download or similar means to user computers connected through a communication network such as the Internet. In this case, the storage device of the server computer is also included in the recording medium of the present invention.
<Embodiment 6>
A full-text search apparatus according to the sixth embodiment of the present invention can limit the time required for merge processing to a predetermined time. Even when the number of registered documents increases, the waiting time of other processing is therefore also bounded by a predetermined time, which improves the throughput of the system as a whole. The response times of registration (insertion) and deletion, which were already shortened in the full-text search apparatus described in a patent application previously filed by the present applicant, are shortened further in this full-text search apparatus.
The full-text search apparatus described in the patent application previously filed by the present applicant has small full-text indexes for registration and for deletion, which avoids degradation of the response times of registration and deletion. During search processing, the apparatus adds the search results of the small full-text index for registration to the search results of a large full-text index, removes from them the search results of the small full-text index for deletion, and returns the outcome to the user as the search result. This applies, to a full-text search apparatus that uses inverted-file full-text indexes, the method of improving registration and deletion throughput in a database management system that is described in another patent application previously filed by the present applicant, in which data storage components for registration and deletion are kept separate from the data storage component for searching. The response times of registration and deletion as seen by the user are thereby shortened. That is, in the full-text search apparatus described in the previously filed application, the data transfer component transfers the inverted lists that make up the inverted-file full-text index rather than the original document data, so as to shorten the time required for data transfer.
However, as the amount of registered document data grows, the inverted lists of the large full-text index also become huge, and the time required to transfer data from the small full-text indexes for registration and deletion (the merge processing of inverted lists) also becomes longer.
In this embodiment, the time required for data transfer is recorded. When the time required for a data transfer at some point exceeds a predetermined time, the condition for starting the next data transfer is changed so that the transfer is executed with smaller small full-text indexes for registration or deletion. This shortens the time required for data transfer and also prevents that time from greatly exceeding the predetermined time.
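The adjustment of the merge start condition can be sketched as a function that lowers the size threshold after a slow merge. The text does not say how the condition is changed, so the halving factor `shrink`, the floor `minimum`, and the function name are assumptions made for illustration.

```python
def next_merge_threshold(threshold, last_merge_seconds, time_limit_seconds,
                         shrink=0.5, minimum=1):
    """Return the small-index size at which the next merge should start.

    If the last merge took longer than the limit, start the next merge
    earlier (with a smaller small index) so that it finishes sooner.
    """
    if last_merge_seconds > time_limit_seconds:
        return max(minimum, int(threshold * shrink))
    return threshold
```

For example, with a threshold of 1000 entries and a 10-second limit, a merge that took 12 seconds lowers the next threshold to 500 entries, while a merge that took 8 seconds leaves it unchanged.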
Fig. 14 is a block diagram illustrating the full-text search apparatus of the sixth embodiment of the present invention. Fig. 15 is a block diagram showing the hardware configuration of a stand-alone example of the full-text search apparatus shown in Fig. 14. Fig. 16 is a block diagram showing the hardware configuration of a server/client example of the full-text search apparatus shown in Fig. 14.
The full-text search apparatus according to the present invention is an apparatus for searching multiple pieces of document data (multiple electronic documents) for documents that contain a specific character string.
As shown in Fig. 14, in the sixth embodiment, the text data for registration processing, the document identifier for deletion processing, the search condition for search processing, and so on are input through an input component 101 and supplied to a registration processing component 103, a deletion processing component 104, and a search processing component 105, respectively. The registration processing component 103 performs registration processing of document data, registering it in a document data storage component 107 and in a small registration full-text index storage component 109. The deletion processing component 104 performs deletion processing of document data. In deletion processing, the deletion processing component 104 uses the document identifier input through the input component 101 to read the document data stored in the document data storage component 107. When the index for that document has already been registered in the small registration full-text index storage component 109, the deletion processing component 104 deletes that index using a text segmentation component 106. When the index is not registered in the small registration full-text index storage component 109, the deletion processing component 104 records the index in a small deletion full-text index storage component 110. The text segmentation component 106 divides document data into partial character strings for the registration processing component 103 during registration processing, divides document data into partial character strings for the deletion processing component 104 during deletion processing, and divides the search condition (search character string) into partial character strings for the search processing component 105 during search processing. Search processing is carried out by the search processing component 105 on a large full-text search index storage component 108, the small registration full-text index storage component 109, and the small deletion full-text index storage component 110. In search processing, the search results from the small deletion full-text index storage component 110 are removed from the search results from the large full-text search index storage component 108 and the small registration full-text index storage component 109, and the result of this subtraction is output through an output component 102 as the search result.
A merge component (data transfer component) 111 transfers data among the large full-text search index storage component 108, the small registration full-text index storage component 109, and the small deletion full-text index storage component 110. A merge time storage component (data transfer time storage component) 112 records the time required for data transfer.
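The way the three indexes are combined at search time can be sketched with set operations. The sketch assumes each index maps a token to a collection of document identifiers; the function name `search` and this representation are hypothetical.

```python
def search(token, large, small_register, small_delete):
    """Return document IDs in (large index + registration index) - deletion index."""
    hits = set(large.get(token, ())) | set(small_register.get(token, ()))
    return sorted(hits - set(small_delete.get(token, ())))
```

Because registration and deletion only touch the two small indexes, the large index can stay unchanged until the merge component 111 transfers the pending entries into it.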
In the stand-alone hardware configuration shown in Fig. 15, an input device 121 corresponds to the input component 101 in Fig. 14, and a display device 122 corresponds to the output component 102 in Fig. 14. A main control device (CPU, memory, etc.) 124 corresponds to the registration processing component 103, the deletion processing component 104, the search processing component 105, the text segmentation component 106, and the merge component 111 in Fig. 14. A storage device 125 corresponds to the document data storage component 107, the large full-text search index storage component 108, the small registration full-text index storage component 109, the small deletion full-text index storage component 110, and the merge time storage component 112 in Fig. 14. In addition, an input/output control device 123 controls the input device 121 and the display device 122 according to control signals from the main control device 124.
In the server/client hardware configuration shown in Fig. 16, an input device 131 of a client 130 corresponds to the input component 101 in Fig. 14, and a display device 132 of the client 130 corresponds to the output component 102 in Fig. 14. A main control device (CPU, memory, etc.) 134 of the client 130 and a main control device (CPU, memory, etc.) 152 of a server 150 correspond to the registration processing component 103, the deletion processing component 104, the search processing component 105, the text segmentation component 106, and the merge component 111 in Fig. 14. A storage device 153 of the server 150 corresponds to the document data storage component 107, the large full-text search index storage component 108, the small registration full-text index storage component 109, the small deletion full-text index storage component 110, and the merge time storage component 112 in Fig. 14. In addition, a network control device 135 of the client 130 and a network control device 151 of the server 150 control data transfer and the like between the client 130 and the server 150 over a network 140. Further, an input/output control device 133 of the client 130 controls the input device 131 and the display device 132 according to control signals from the main control device 134.
An example of the operation of the full-text search device configured as above is described in detail below.
"Registration processing"
In registration processing, the user first creates document data and registers it through input unit 101. Registration processing unit 103 stores the document data in document data storage unit 107 and at the same time determines an identifier (document identifier) that denotes the document data. Registration processing unit 103 then uses text segmentation unit 106 to obtain, from the document data, partial character strings (tokens) and the occurrence position information of each token. Finally, registration processing unit 103 records the document identifier and the occurrence position information of the tokens in small registration full-text index storage unit 109. As its segmentation method, text segmentation unit 106 can use, for example, a method in which each group of N characters forms one token, or a method in which morphological analysis is performed so that each word forms one token. The examples below describe the case where text segmentation unit 106 forms one token from each group of N characters; they apply equally to the method in which morphological analysis is used so that each word forms one token.
Figure 17 is a diagram illustrating the processing of the full-text search device shown in Figure 14 and shows an example of a full-text index. The full-text index of the inverted-file method is described in detail below with reference to the example shown in Figure 17.
The contents of document data "document 1" and "document 2" (the contents obtained by the segmentation processing of text segmentation unit 106) are denoted in Figure 17 by reference numerals 161 and 162, respectively. In Figure 17, the number to the left of each document indicates the character count from the start of the character string. Specifically, in document 1, "full-text search" (shown in Chinese characters in Figure 17) appears at the 11th character from the start, "method" appears at the 20th and 60th characters, and "text search method" appears at the 31st character. In document 2, "heuristic approach" appears at the 1st character from the start, "method" appears at the 24th character, and "full text" appears at the 30th and 42nd characters.
When each group of two characters forms one partial character string (token), all partial character strings in the document are extracted, and the occurrence positions of each partial character string in the document (the character count from the start) are recorded together in the index. For example, in document 1 the token "全文" ("full text") appears at positions 11 and 31 and the token "文检" appears at positions 12 and 32, and these positions are recorded in the index. Besides the occurrence positions in the document, the document identifier denoting the document and the number of occurrences are also recorded in the index, in the form shown by reference numeral 163 in Figure 17. For example, the inverted list of "全文", namely {1, 2 (11, 31)} and {2, 2 (30, 42)}, indicates that "全文" appears twice in document 1, at positions 11 and 31, and twice in document 2, at positions 30 and 42.
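The index layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the device: the function names `ngram_tokens`, `build_index` and `posting_list` and the sample strings are hypothetical, and positions are counted from 1 as in Figure 17.

```python
from collections import defaultdict

def ngram_tokens(text, n=2):
    """Return (token, position) pairs; positions are 1-based character counts."""
    return [(text[i:i + n], i + 1) for i in range(len(text) - n + 1)]

def build_index(docs, n=2):
    """Build token -> {doc_id: [positions]} from a {doc_id: text} mapping."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token, pos in ngram_tokens(text, n):
            index[token].setdefault(doc_id, []).append(pos)
    return index

def posting_list(index, token):
    """Return entries as (doc_id, occurrence count, positions) tuples."""
    return [(doc_id, len(positions), tuple(positions))
            for doc_id, positions in sorted(index.get(token, {}).items())]

# Hypothetical sample documents (not the documents of Figure 17).
docs = {1: "abcab", 2: "xab"}
index = build_index(docs)
print(posting_list(index, "ab"))  # [(1, 2, (1, 4)), (2, 1, (2,))]
```

Each tuple mirrors one `{document identifier, count (positions)}` entry of an inverted list.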
"Deletion processing"
In deletion processing, the user first enters, through input unit 101, the document identifier of the document to be deleted. Deletion processing unit 104 then reads the document data corresponding to the document identifier from document data storage unit 107. Next, deletion processing unit 104 uses text segmentation unit 106 to obtain the partial character strings (tokens) and the positions at which the tokens occur in the document data. If the document identifier is registered in the small registration full-text index, the occurrence position information of the tokens is deleted from small registration full-text index storage unit 109. If the document identifier is not registered in the small registration full-text index (that is, it is registered in the large full-text search index), the document identifier and the occurrence position information of the tokens are recorded in small deletion full-text index storage unit 110. Finally, deletion processing unit 104 deletes the document data corresponding to the document identifier from document data storage unit 107.
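The branching in the deletion flow above can be sketched as follows. This is a hedged illustration only: the dictionaries `documents`, `small_reg` and `small_del`, the two-character tokenizer and the function names are assumptions made for the example, with each index modelled as token -> {document identifier: positions}.

```python
def ngram_tokens(text, n=2):
    """Return (token, position) pairs; positions are 1-based character counts."""
    return [(text[i:i + n], i + 1) for i in range(len(text) - n + 1)]

def delete_document(doc_id, documents, small_reg, small_del, n=2):
    """Delete one document, following the two branches described in the text."""
    text = documents[doc_id]                      # read the stored document data
    postings = {}
    for token, pos in ngram_tokens(text, n):      # re-tokenize it
        postings.setdefault(token, []).append(pos)

    in_small_reg = any(doc_id in per_doc for per_doc in small_reg.values())
    if in_small_reg:
        # Not merged yet: remove its entries from the small registration index.
        for token in postings:
            per_doc = small_reg.get(token, {})
            per_doc.pop(doc_id, None)
            if not per_doc:
                small_reg.pop(token, None)
    else:
        # Already in the large index: remember the deletion in the small deletion index.
        for token, positions in postings.items():
            small_del.setdefault(token, {})[doc_id] = positions

    del documents[doc_id]                         # finally drop the document data

# Hypothetical data: document 7 is still in the small registration index,
# document 1 is assumed to have been merged into the large index already.
documents = {1: "abab", 7: "xab"}
small_reg = {"xa": {7: [1]}, "ab": {7: [2]}}
small_del = {}
delete_document(7, documents, small_reg, small_del)
delete_document(1, documents, small_reg, small_del)
```

The large index is never touched here; its entries are removed later, when the small deletion index is merged.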
"Search processing"
In search processing, the user first enters a search string through input unit 101. Search processing unit 105 then uses text segmentation unit 106 to obtain tokens from the search string. Search processing unit 105 uses the large full-text search index in large full-text search index storage unit 108 to obtain a set (Rs) of document identifiers of the document data containing the search string, and uses the small registration full-text index in small registration full-text index storage unit 109 to obtain a set (Ri) of document identifiers of the document data containing the search string. Next, search processing unit 105 uses the small deletion full-text index in small deletion full-text index storage unit 110 to obtain a set (Rd) of document identifiers of the document data containing the search string. To obtain the search result (R), search processing unit 105 performs the following operation on the obtained sets of document identifiers (Rs, Ri, Rd), and outputs the search result (R) to the user through output unit 102 as the set of document identifiers of the document data containing the search string.
R=Rs+Ri-Rd,
where + denotes the logical OR operator and - denotes the logical NOT operator; that is, R consists of the document identifiers contained in Rs or Ri, excluding those contained in Rd.
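Read as set operations, the expression for R is a union followed by a difference. A minimal Python sketch, in which the function name `combine_results` and the sample identifiers are hypothetical:

```python
def combine_results(rs, ri, rd):
    """R = Rs + Ri - Rd: hits from the large and small registration indexes,
    minus hits recorded in the small deletion index."""
    return (set(rs) | set(ri)) - set(rd)

# Hypothetical identifier sets for one search string.
result = combine_results(rs={1, 2}, ri={5, 8}, rd={1})
print(sorted(result))  # [2, 5, 8]
```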
An example of search processing using full-text index 163 shown in Figure 17 is described in detail below.
When the search string is "全文检索" ("full-text search"), text segmentation unit 106 extracts the three tokens "全文", "文检" and "检索". The three inverted lists corresponding to these tokens are then looked up in full-text index 163. From the differences between the occurrence positions of the tokens found in the inverted lists, it is determined that "full-text search" appears in document 1 at positions 11 and 31.
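One way to picture the position check is the following sketch, which accepts a start position only when every successive token of the query occurs one character later in the same document. It is an illustration under assumptions: the function name `phrase_positions` is hypothetical, and the sample index uses placeholder Latin tokens with made-up positions patterned after the Figure 17 example.

```python
def phrase_positions(index, query, n=2):
    """Return {doc_id: [start positions]} where the whole query occurs.

    index maps token -> {doc_id: [1-based positions]}.
    """
    tokens = [query[i:i + n] for i in range(len(query) - n + 1)]
    if not tokens:
        return {}
    hits = {}
    for doc_id, starts in index.get(tokens[0], {}).items():
        # The k-th token must occur exactly k characters after the first one.
        matched = [p for p in starts
                   if all(p + k in index.get(tok, {}).get(doc_id, [])
                          for k, tok in enumerate(tokens))]
        if matched:
            hits[doc_id] = matched
    return hits

# Placeholder index: "ab", "bc", "cd" stand in for the three query tokens.
index = {
    "ab": {1: [11, 31], 2: [30, 42]},
    "bc": {1: [12, 32]},
    "cd": {1: [13, 33]},
}
print(phrase_positions(index, "abcd"))  # {1: [11, 31]}
```

Document 2 contains the first token but not the following ones at adjacent positions, so it is not reported.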
"Merge processing"
The merge processing performed by merge unit 111 replaces the processing of the data transmission component described in the above-mentioned patent application, which was also filed by the present applicant.
Compared with registration/deletion processing that starts from the original document data, this processing directly uses inverted lists that have already been built, so no time is needed to split the text into tokens in text segmentation processing or to create inverted lists, and the data transmission time is shortened accordingly. In the present invention, because this processing is carried out between inverted lists, the data transmission processing is also called merge processing. When the registration/deletion processing of document data in the full-text search device is carried out between inverted lists, the full-text search index directly uses the inverted lists that already exist in the registration/deletion data; this shortens the merge processing time of the full-text search index and also reduces the waiting time of search processing.
In merge processing, first, for every token in the small registration full-text index, (a) the inverted list of the token is extracted from the full-text index and (b) the extracted inverted list is appended to the end of the inverted list of the corresponding token in the large full-text search index. The small registration full-text index is then emptied. Next, for every token in the small deletion full-text index, (c) the inverted list of the token is extracted from the full-text index and (d) the occurrence position information contained in the extracted inverted list is deleted from the inverted list of the corresponding token in the large full-text search index. The small deletion full-text index is then emptied.
Figure 18 is a diagram illustrating, as an example, merge processing of the inverted list of the token "full text" in full-text index 163 shown in Figure 17.
In this example, inverted list 171 is the inverted list of "full text" in the full-text search index, namely {1, 2 (11, 31)} and {2, 2 (30, 42)}; inverted list 172 is the inverted list of "full text" in the registration full-text index, namely {5, 2 (4, 16)} and {8, 1 (3)}; and inverted lists 171 and 172 are the objects of merge processing 173. Merge processing 173 yields inverted list 174 of "full text", namely {1, 2 (11, 31)}, {2, 2 (30, 42)}, {5, 2 (4, 16)} and {8, 1 (3)}. Next, merge processing 175 is performed on inverted list 174 and on inverted list 176 of "full text" in the deletion full-text index, namely {1, 2 (11, 31)}, yielding inverted list 177 of "full text", namely {2, 2 (30, 42)}, {5, 2 (4, 16)} and {8, 1 (3)}.
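The two merge steps of this example can be replayed with a short sketch. It assumes a simple in-memory representation (token -> list of `(document identifier, positions)` entries; the occurrence count is implied by the number of positions), and the names `merge_registrations` and `merge_deletions` and the placeholder key `"t"` are hypothetical.

```python
def merge_registrations(large, small_reg):
    """Append each small-registration inverted list to the end of the
    matching list in the large index, then empty the small index."""
    for token, entries in small_reg.items():
        large.setdefault(token, []).extend(entries)
    small_reg.clear()

def merge_deletions(large, small_del):
    """Remove from the large index every entry listed in the small
    deletion index, then empty the small index."""
    for token, entries in small_del.items():
        doomed = set(entries)
        large[token] = [e for e in large.get(token, []) if e not in doomed]
    small_del.clear()

# "t" stands for the token whose lists are shown in Figure 18.
large = {"t": [(1, (11, 31)), (2, (30, 42))]}        # list 171
small_reg = {"t": [(5, (4, 16)), (8, (3,))]}         # list 172
small_del = {"t": [(1, (11, 31))]}                   # list 176

merge_registrations(large, small_reg)                # gives list 174
merge_deletions(large, small_del)                    # gives list 177
print(large["t"])  # [(2, (30, 42)), (5, (4, 16)), (8, (3,))]
```

Neither step re-tokenizes any document text; both work only on lists that already exist.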
"Form 1 of merge processing"
Registration processing unit 103 starts merge processing of the small registration full-text index when the start condition is satisfied; that is, when the number of document identifiers registered in the small registration full-text index of small registration full-text index storage unit 109 reaches a predetermined value (Ni), merge unit 111 performs merge processing.
In merge processing, when the above start condition is satisfied, the start time (Ts) is first recorded (stored) in merge time storage unit 112. Then, for every token in the small registration full-text index, (a) the inverted list of the token is extracted from the full-text index and (b) the extracted inverted list is appended to the end of the inverted list of the corresponding token in the large full-text search index. The small registration full-text index is then emptied. Next, for every token in the small deletion full-text index, (c) the inverted list of the token is extracted from the full-text index and (d) the occurrence position information contained in the extracted inverted list is deleted from the inverted list of the corresponding token in the large full-text search index. The small deletion full-text index is then emptied. Finally, the end time (Te) is stored in merge time storage unit 112.
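The count-based start condition and the recording of Ts and Te can be sketched as below. This is a simplified illustration, not the device's control logic: the class name `MergeTrigger`, its attributes and the injectable `clock` are assumptions made for the example, and the merge itself is passed in as a callable.

```python
import time

class MergeTrigger:
    """Start a merge once `threshold` (Ni) registrations have accumulated,
    and remember how long the merge took (Te - Ts)."""

    def __init__(self, threshold, clock=time.monotonic):
        self.threshold = threshold
        self.clock = clock
        self.pending = 0            # document identifiers registered since last merge
        self.last_duration = None   # Te - Ts of the most recent merge

    def note_registration(self, merge_fn):
        """Count one registration; run merge_fn if the start condition holds."""
        self.pending += 1
        if self.pending < self.threshold:
            return False
        ts = self.clock()           # start time Ts
        merge_fn()
        te = self.clock()           # end time Te
        self.last_duration = te - ts
        self.pending = 0
        return True

# Usage with a stand-in merge function.
merged = []
trigger = MergeTrigger(threshold=3)
for _ in range(3):
    trigger.note_registration(lambda: merged.append("merge"))
print(merged)  # ['merge']
```

The recorded duration is what a later step would compare against the preset time T.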
In the above merge processing, when the merge processing time (Te - Ts) exceeds a preset time (T), the start condition for the next merge processing is changed according to the following expression (1).
Ni × (1 - (Te - Ts)/T)    (1)
According to form 1 of merge processing, using the number of registered document identifiers as the start condition makes it unnecessary to manage the size of the full-text storage unit, which simplifies the processing to be performed.
"Form 2 of merge processing"
Registration processing unit 103 starts merge processing of the small registration full-text index when the start condition is satisfied, that is, when the storage capacity (size) used in small registration full-text index storage unit 109 reaches a predetermined value (Si); the merge processing is performed by merge unit 111. In this merge processing, expression (1), which defines the start condition in form 1 of merge processing, is replaced by the following expression (2).
Si × (1 - (Te - Ts)/T)    (2)
When the document data supplied by users varies in size, form 2 of merge processing can prevent merge processing from being started while small document data is being registered continuously and before the time needed for registration in the small registration full-text index becomes long. Using the size as the start condition has an effect equivalent to using the merge processing time.
"Form 3 of merge processing"
Deletion processing unit 104 starts merge processing of the small deletion full-text index when the start condition is satisfied; that is, when the number of document identifiers registered in the small deletion full-text index of small deletion full-text index storage unit 110 reaches a predetermined value (Nd), merge unit 111 performs merge processing. In this merge processing, expression (1), which defines the start condition in form 1 of merge processing, is replaced by the following expression (3).
Nd × (1 - (Te - Ts)/T)    (3)
According to form 3 of merge processing, the merge processing time is advantageously shortened when deletion processing does not occur frequently.
"Form 4 of merge processing"
Deletion processing unit 104 starts merge processing of the small deletion full-text index when the start condition is satisfied; that is, when the size of the small deletion full-text index in small deletion full-text index storage unit 110 reaches a predetermined value (Sd), merge unit 111 performs merge processing. In this merge processing, expression (1), which defines the start condition in form 1 of merge processing, is replaced by the following expression (4).
Sd × (1 - (Te - Ts)/T)    (4)
According to form 4 of merge processing, the merge processing time is advantageously shortened when deletion processing does not occur frequently.
According to the forms of merge processing described above, the full-text search device can start merge processing of the full-text index under conditions suited to the characteristics of the document data to be registered/deleted and/or the characteristics of the field in which the device is used. This reduces the number of times merge processing occurs and improves the throughput of the system as a whole. Furthermore, even when the number of registered documents increases, the full-text search device limits the time required for merge processing to within a preset time, so that the waiting time of other processing is also limited to within the preset time, which further improves the throughput of the system as a whole.
In addition, the present invention is not limited to the full-text search device described above; it may also take the form of a program for realizing the functions of the full-text search device, a program for realizing the functions of the units that make up the full-text search device, or a computer-readable recording medium storing such a program.
<Embodiment 7>
An example is described below in which a program and data for realizing the functions of the full-text search device of the present invention are stored in a computer-readable recording medium. Specifically, the computer-readable recording medium may be a CD-ROM, a magneto-optical disk, a DVD-ROM, an FD, a flash memory, various types of ROM and RAM, and so on. A program that causes a computer to perform the functions of the full-text search device of the preceding embodiments, and thereby realizes the full-text search functions, is recorded on a recording medium for sale, which makes these functions easy to realize. The full-text search functions of the present invention can be realized by mounting the recording medium in an information processing device such as a computer and having the information processing device read the program, or by storing the program in a storage medium provided in the information processing device and reading the program when needed.
<Embodiment 8>
Figure 19 is a block diagram illustrating a full-text search device according to the eighth embodiment of the invention.
The full-text search device according to this embodiment is likewise a device that searches a plurality of document data items (a plurality of electronic documents) for documents containing a specified character string. Here, "full-text" in the full-text search device means that all character strings to be searched are searched. Accordingly, for a tagged document such as an SGML document, a complete search of only the character strings between predetermined tags also qualifies.
As shown in Figure 19, in this embodiment the text data for registration processing, the document identifier for deletion processing, the search condition for search processing and so on are entered through input unit 201 and supplied to registration processing unit 203, deletion processing unit 204 and search processing unit 205, respectively. Registration processing unit 203 performs registration processing of document data, operating on document data storage unit 207 and small registration full-text index storage unit 209. Deletion processing unit 204 performs deletion processing of document data. In deletion processing, deletion processing unit 204 reads the document data stored in document data storage unit 207 according to the document identifier entered through input unit 201; if the corresponding index has been registered in small registration full-text index storage unit 209, it deletes that index using text segmentation unit 206; if the index has not been registered in small registration full-text index storage unit 209, it records the index in small deletion full-text index storage unit 210. Text segmentation unit 206 splits document data into partial character strings for registration processing unit 203 in registration processing, splits document data into partial character strings for deletion processing unit 204 in deletion processing, and splits the search condition (search string) into partial character strings for search processing unit 205 in search processing.
In addition, search processing unit 205 performs search processing on large full-text search index storage unit 208, small registration full-text index storage unit 209 and small deletion full-text index storage unit 210. In search processing, the search results from small deletion full-text index storage unit 210 are removed from the search results from large full-text search index storage unit 208 and small registration full-text index storage unit 209, and the remaining results are output as the search results through output unit 202. Merge unit 211 carries out data transmission (data transmission in a broad sense) among large full-text search index storage unit 208, small registration full-text index storage unit 209 and small deletion full-text index storage unit 210.
In addition, deletion processing unit 204 may perform deletion processing without small deletion full-text index storage unit 210, using only small registration full-text index storage unit 209 together with another deletion management method for document data (and indexes). For example, only the document data to be deleted is deleted, and, in an idle state or the like where ample processing time is available, the data in large full-text search index storage unit 208 is updated by matching it against the document data stored in document data storage unit 207. Conversely, deletion processing unit 204 may perform deletion processing without small registration full-text index storage unit 209, using only small deletion full-text index storage unit 210.
In the standalone hardware configuration shown in Figure 15 above, input device 121 corresponds to input unit 201 in Figure 19, and display device 122 corresponds to output unit 202 in Figure 19. Main control device (CPU, memory, etc.) 124 corresponds to registration processing unit 203, deletion processing unit 204, search processing unit 205, text segmentation unit 206 and merge unit 211 in Figure 19. Storage device 125 as a whole corresponds to document data storage unit 207, large full-text search index storage unit 208, small registration full-text index storage unit 209 and small deletion full-text index storage unit 210 shown in Figure 19; alternatively, separate memories correspond to the respective storage units 207 to 210 shown in Figure 19, or files in storage device 125 correspond to storage units 207 to 210 shown in Figure 19. For example, when the full-text search of the present invention is performed with a limited storage device, the areas it uses are allocated appropriately according to whether search processing or registration/deletion processing is mainly performed. In addition, input/output control unit 123 controls input device 121 and display device 122 according to control signals from main control device 124.
In the client/server hardware configuration shown in Figure 16 above, input device 131 of client 130 corresponds to input unit 201 in Figure 19, and display device 132 of client 130 corresponds to output unit 202 in Figure 19. Main control devices (CPU, memory, etc.) 134 and 152 of client 130 and server 150 correspond to registration processing unit 203, deletion processing unit 204, search processing unit 205, text segmentation unit 206 and merge unit 211 in Figure 19. Storage device 153 of the server as a whole corresponds to document data storage unit 207, large full-text search index storage unit 208, small registration full-text index storage unit 209 and small deletion full-text index storage unit 210 in Figure 19; alternatively, separate memories connected to server 150 correspond to the respective storage units 207 to 210 shown in Figure 19, or files in storage device 153 correspond to storage units 207 to 210 shown in Figure 19. In addition, network control unit 135 of client 130 and network control unit 151 of server 150 control data transmission and the like between client 130 and server 150 over network 140. Furthermore, input/output control unit 133 of client 130 controls input device 131 and display device 132 according to control signals from main control device 134.
Next, an example of the operation of the full-text search device according to the eighth embodiment is described in detail.
Figures 20 to 22 are flowcharts illustrating examples of processing in the full-text search device shown in Figure 19.
When the full-text search device receives a processing request from the user (step ST1), it first determines whether the request is for registration processing (step ST2), for deletion processing (step ST3), or for search processing ("No" in step ST3). Based on these determinations, the full-text search device performs one of the following kinds of processing.
"Registration processing"
In registration processing, the user first creates document data and registers it through input unit 201. Registration processing unit 203 stores the document data in document data storage unit 207 and at the same time determines an identifier (document identifier) that denotes the document data (step ST11 in Figure 21). For example, for a document containing tags such as SGML tags, only the character strings between predetermined tags may be handled in this processing. Registration processing unit 203 then uses text segmentation unit 206 to obtain, from the document data, partial character strings (tokens) and the occurrence position information of each token (step ST12). Finally, registration processing unit 203 records the document identifier and the occurrence position information of the tokens in small registration full-text index storage unit 209 (step ST13). "Recording" in step ST13 means recording the full-text index in the storage unit (the same applies below), and the processing in step ST13 is also referred to as the index storing step. As its segmentation method, text segmentation unit 206 can use, for example, a method in which each group of N characters forms one token, or a method in which morphological analysis is performed so that each word forms one token. The examples below describe the case where text segmentation unit 206 forms one token from each group of N characters; they apply equally to the method in which morphological analysis is used so that each word forms one token. Merge processing is performed at an appropriate time after the recording in step ST13, as described below.
In addition, full-text indexing by the inverted-file method using text segmentation unit 206 is substantially the same as described above with reference to Figure 17.
"Deletion processing"
In deletion processing, the user first enters, through input unit 201, the document identifier of the document to be deleted. Deletion processing unit 204 then reads the document data corresponding to the document identifier from document data storage unit 207 (step ST21 in Figure 22). Next, deletion processing unit 204 uses text segmentation unit 206 to obtain the partial character strings (tokens) and the positions at which the tokens occur in the document data (step ST22). For example, for a document containing tags such as SGML tags, only the character strings between predetermined tags may be handled in this processing. Deletion processing unit 204 determines whether the document identifier is registered in the small registration full-text index (step ST23). If the document identifier is registered in the small registration full-text index, the occurrence position information of the tokens is deleted from small registration full-text index storage unit 209 (step ST25). If the document identifier is not registered in the small registration full-text index (that is, it is registered in the large full-text search index), the document identifier and the occurrence position information of the tokens are recorded in small deletion full-text index storage unit 210 (step ST24). Next, deletion processing unit 204 deletes the document data corresponding to the document identifier from document data storage unit 207 (step ST29). Merge processing is performed at an appropriate time after the recording in step ST24 (steps ST26 to ST28), as described below.
"Search processing"
In search processing, the user first enters a search string through input unit 201. Search processing unit 205 then uses text segmentation unit 206 to obtain tokens from the search string (step ST4 in Figure 20). Search processing unit 205 uses the large full-text search index in large full-text search index storage unit 208 to obtain a set (Rs) of document identifiers of the document data containing the search string (step ST5), and uses the small registration full-text index in small registration full-text index storage unit 209 to obtain a set (Ri) of document identifiers of the document data containing the search string (step ST6). Next, search processing unit 205 uses the small deletion full-text index in small deletion full-text index storage unit 210 to obtain a set (Rd) of document identifiers of the document data containing the search string (step ST7). To obtain the search result (R), search processing unit 205 performs the following operation on the obtained sets of document identifiers (Rs, Ri, Rd), and outputs the search result (R) to the user through output unit 202 as the set of document identifiers of the document data containing the search string (step ST9).
R=Rs+Ri-Rd,
where + denotes the logical OR operator and - denotes the logical NOT operator; that is, R consists of the document identifiers contained in Rs or Ri, excluding those contained in Rd.
In addition, search processing using text segmentation unit 206 is substantially the same as the example using full-text index 163 described above with reference to Figure 17.
"Merge processing"
The merge processing performed by merge unit 211 replaces the processing of the data transmission component described in the above-mentioned patent application, which was also filed by the present applicant.
Compared with registration/deletion processing that starts from the original document data, this processing directly uses inverted lists that have already been built, so no time is needed to split the text into tokens in text segmentation processing or to create inverted lists, and the time of data transmission processing is shortened accordingly. In the present invention, because this processing is carried out between inverted lists, the data transmission processing (data transmission step) is also called merge processing (merge step). When the registration/deletion processing of document data in the full-text search device is carried out between inverted lists, the full-text search index directly uses the inverted lists that already exist in the registration/deletion data; this shortens the merge processing time of the full-text search index and also reduces the waiting time of search processing.
In merge processing, first, for every token in the small registration full-text index, (a) the inverted list of the token is extracted from the full-text index (step ST14 in Figure 21) and (b) the extracted inverted list is appended to the end of the inverted list of the corresponding token in the large full-text search index (step ST15). The small registration full-text index is then emptied (step ST16). Next, for every token in the small deletion full-text index, (c) the inverted list of the token is extracted from the full-text index (step ST26 in Figure 22) and (d) the occurrence position information contained in the extracted inverted list is deleted from the inverted list of the corresponding token in the large full-text search index (step ST27). The small deletion full-text index is then emptied (step ST28).
In addition, merge processing of inverted lists according to this embodiment is substantially the same as the example described above with reference to Figure 18.
"Form 1 of merge processing"
When the quantity of the document identifier of registering in the small-sized full text registration index at small-sized full text registration index stores parts 209 reaches a predetermined value, merge and handle, and carry out by merging parts 211 by 203 startups of location registration process parts.
According to merging the form of handling 1, utilize the quantity of the document identifier of registration just not need to manage the size of memory unit in full as entry condition, this just makes the processing that will carry out simpler.
" merging the form of handling 2 "
When the memory capacity (size) at small-sized full text registration index stores parts 209 reaches a predetermined value, merge and handle, and carry out by merging parts 211 by 203 startups of location registration process parts.When from the varying in size of user's document data, according to merging the form of handling 2, small-sized document data is constantly registered and is being used for can preventing to merge the startup of processing before the time that small-sized full text registration index is registered is elongated.Use size to be equal to as entry condition and the effect that merges the processing time.
" merging the form of handling 3 "
When the quantity that satisfies the document identifier of registering in small-sized full text deletion index reached the entry condition of a predetermined value, the merging of small-sized full text deletion index was handled by deletion processing element 204 and is started, and carried out by merging parts 211.
" merging the form of handling 4 "
When the size that satisfies small-sized full text deletion index stores parts 210 reached the entry condition of a predetermined value, the merging of small-sized full text deletion index was handled by deletion processing element 204 and is started, and carried out by merging parts 211.
According to merging the form of handling 3 and 4, when not frequent generation was handled in deletion, it had the advantage that shortening merges the time of handling.
With the merge processing modes described above, the full-text search device can start merge processing of the full-text index under a condition suited to the characteristics of the document data to be registered/deleted and/or the characteristics of the field in which the device is used. This reduces the number of times merge processing occurs and improves the throughput of the system as a whole. In addition, merge processing for registration and merge processing for deletion may be started at the same time under any of the start conditions.
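The start conditions of modes 1 to 4 reduce to two kinds of threshold: a count of registered document identifiers or a storage size. A minimal sketch of such a check follows; the class `SmallIndex`, the function `should_merge` and the parameter names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class SmallIndex:
    """Hypothetical bookkeeping for one small registration or deletion index."""
    doc_ids: set = field(default_factory=set)   # registered document identifiers
    size_bytes: int = 0                         # storage currently used


def should_merge(index, max_docs=None, max_bytes=None):
    """Return True when a configured start condition is met.

    max_docs  -> count-based condition (modes 1 and 3)
    max_bytes -> size-based condition  (modes 2 and 4)
    """
    by_count = max_docs is not None and len(index.doc_ids) >= max_docs
    by_size = max_bytes is not None and index.size_bytes >= max_bytes
    return by_count or by_size


idx = SmallIndex(doc_ids={1, 2, 3}, size_bytes=500)
```

Which threshold to configure is a deployment choice: a count is simpler to track, while a size bound keeps the amount of data moved per merge roughly constant when document sizes vary.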
Thus, in the full-text search device of this embodiment, the data transfer component (merge component 211) that transfers data from the small full-text index to the large full-text index transfers not the original document data but the inverted lists that make up the full-text index in the inverted file method, which shortens the time required for data transfer.
<Embodiment 9>
Next, the full-text search device according to the ninth embodiment of the present invention is described in detail. In it, the full-text search device of the preceding embodiment applies the write-delay database management method/device described in another patent application filed by the present applicant. The full-text search device of this embodiment solves the following problem: while data is being transferred (inverted lists are being merged) from a small full-text index for registration or deletion to the large full-text index for search, the small full-text index storage component for registration or deletion cannot be used, so registration processing or deletion processing cannot be performed at the same time.
Figure 23 is a block diagram illustrating the full-text search device according to the ninth embodiment of the present invention.
The full-text search device of the ninth embodiment has two small full-text index storage components for registration and two for deletion, so that while one small full-text index is being merged (data transfer) into the large full-text index, the other small full-text index can be used for registration processing or deletion processing, which eliminates time during which processing cannot be executed. That is, the two small full-text indexes for registration allow registration processing to proceed even while merge processing is being executed, and the two small full-text indexes for deletion allow deletion processing to proceed even while merge processing is being executed. The ninth embodiment is particularly suited to situations in which registration processing, together with the merge processing it entails, is performed continuously and frequently, for example when documents are read by a scanner or the like, subjected to OCR processing, and then registered. This makes full-text search with fast response possible for image data as well as for ordinary application data.
In the ninth embodiment, the small full-text registration index storage component 209 described with reference to Figure 19 comprises two storage components, namely small full-text registration index storage component A (209a) and small full-text registration index storage component B (209b). The small full-text deletion index storage component 210 described with reference to Figure 19 likewise comprises two storage components, namely small full-text deletion index storage component A (210a) and small full-text deletion index storage component B (210b). The hardware configurations shown in Figures 15 and 16 also apply to the full-text search device of the ninth embodiment. One or more of the above storage components may also be provided in memory rather than in storage device 125 or 153.
Next, an example of the operation of the full-text search device according to the ninth embodiment is described in detail.
Figures 24 to 26 are flowcharts illustrating an example of the processing procedure of the full-text search device shown in Figure 23.
When the full-text search device receives a processing request from a user (step ST31), it first judges whether the request is for registration processing (step ST32), for deletion processing (step ST33), or for search processing (No in step ST33). The device then executes one of the following processes according to this judgment.
"Registration processing"
In registration processing, the user first creates document data and registers it through the input component 201. The registration processing component 203 stores the document data in the document data storage component 207 and, at the same time, determines an identifier (document identifier) that designates the document data (step ST41 in Figure 25). Next, using the text segmentation component 206, the registration processing component 203 obtains partial character strings (tokens) and the occurrence position information of each token in the document data (step ST42). The segmentation method and the full-text index are the same as described above. The registration processing component 203 records the document identifier and the occurrence position information of the tokens in the small full-text registration index storage component in use at that time (for example, small full-text registration index storage component A (209a)) (step ST43).
After the recording in step ST43, merge processing is performed as appropriate. In this example, merge processing is executed according to a predetermined merge start condition. First, following the recording in step ST43, the registration processing component 203 judges whether the merge start condition is satisfied (step ST44). When it is not satisfied (No in step ST44), the processing ends. The start conditions of the merge processing modes described in the preceding embodiment also apply to the ninth embodiment. When the merge start condition is satisfied (Yes in step ST44), the registration processing component 203 judges whether the other small full-text registration index storage component (in this example, small full-text registration index storage component B (209b)) is undergoing merge processing (step ST45). When small full-text registration index storage component B (209b) is undergoing merge processing (Yes in step ST45), the registration processing component 203 waits for that merge processing to finish.
When the merge start condition is satisfied and the other small full-text registration index storage component B (209b) is not undergoing merge processing (No in step ST45), the registration processing component 203 starts the merge processing described below (steps ST47 to ST49) for small full-text registration index A in small full-text registration index storage component A (209a), and switches the storage component used for recording in the next registration processing from small full-text registration index storage component A (209a) to small full-text registration index storage component B (209b) (step ST46). Once merge processing has been started, the merge component 211 executes it asynchronously with the registration processing component 203.
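The switching between storage components A and B can be pictured as a double-buffering scheme. The sketch below is an assumption-laden illustration, not the patented implementation: the class name, the count-based threshold and the use of a Python thread to stand in for the asynchronous merge component 211 are all choices made for the example, and concurrent searches are not modelled.

```python
import threading


class DoubleBufferedRegistration:
    """Two small registration indexes; one accepts writes while the other merges."""

    def __init__(self, threshold):
        self.large = {}                # large search index: token -> postings
        self.small = [{}, {}]          # storage components A (0) and B (1)
        self.doc_counts = [0, 0]
        self.active = 0                # component used for the next registration
        self.threshold = threshold     # hypothetical count-based start condition
        self.merge_thread = None

    def register(self, doc_id, tokens):
        buf = self.small[self.active]
        for pos, tok in enumerate(tokens):                    # record, cf. ST43
            buf.setdefault(tok, []).append((doc_id, pos))
        self.doc_counts[self.active] += 1
        if self.doc_counts[self.active] < self.threshold:     # cf. ST44
            return
        if self.merge_thread is not None:                     # cf. ST45
            self.merge_thread.join()                          # wait for the other merge
        full = self.active
        self.active = 1 - self.active                         # switch, cf. ST46
        self.merge_thread = threading.Thread(target=self._merge, args=(full,))
        self.merge_thread.start()                             # asynchronous merge

    def _merge(self, i):
        for tok, postings in self.small[i].items():           # cf. ST47, ST48
            self.large.setdefault(tok, []).extend(postings)
        self.small[i].clear()                                 # cf. ST49
        self.doc_counts[i] = 0


idx = DoubleBufferedRegistration(threshold=2)
idx.register(1, ["ab", "bc"])
idx.register(2, ["ab"])       # start condition met: A is merged, B becomes active
idx.register(3, ["cd"])       # recorded in B while A may still be merging
idx.merge_thread.join()
```

The registration call blocks only when both components are unavailable, that is, when the active one is full and the other is still being merged.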
"Deletion processing"
In deletion processing, the user first inputs, through the input component 201, the document identifier of the document to be deleted. The deletion processing component 204 then reads the document data from the document data storage component 207 according to the document identifier (step ST51 in Figure 26). Next, the deletion processing component 204 uses the text segmentation component 206 to obtain partial character strings (tokens) and the occurrence position information of the tokens from the document data (step ST52).
The deletion processing component 204 then judges whether the document identifier is registered in a small full-text registration index (step ST53). When the document identifier is registered in a small full-text registration index, the occurrence position information of the tokens is deleted from the small full-text registration index storage component 209 (209a and 209b) (step ST55). When the document identifier is not registered in a small full-text registration index (that is, when it is registered in the large full-text search index), the document identifier and the occurrence position information of the tokens are recorded in the small full-text deletion index storage component in use at that time (for example, small full-text deletion index storage component A (210a)) (step ST54). The deletion processing component 204 subsequently deletes the document data from the document data storage component 207 according to the document identifier (step ST62).
After the recording in step ST54, merge processing is performed as appropriate. In this example, merge processing is executed according to a predetermined merge start condition. First, following the recording in step ST54, the deletion processing component 204 judges whether the merge start condition is satisfied (step ST56). When it is not satisfied (No in step ST56), the processing ends after step ST62 is executed. As noted above, the start conditions of the merge processing modes in the preceding embodiment also apply to the ninth embodiment. When the merge start condition is satisfied (Yes in step ST56), the deletion processing component 204 judges whether the other small full-text deletion index storage component (in this example, small full-text deletion index storage component B (210b)) is undergoing merge processing (step ST57). When small full-text deletion index storage component B (210b) is undergoing merge processing (Yes in step ST57), the deletion processing component 204 waits for that merge processing to finish.
When the merge start condition is satisfied and the other small full-text deletion index storage component B (210b) is not undergoing merge processing (No in step ST57), the deletion processing component 204 starts the merge processing described below (steps ST59 to ST61) for small full-text deletion index A in small full-text deletion index storage component A (210a), and switches the storage component used for recording in the next deletion processing from small full-text deletion index storage component A (210a) to small full-text deletion index storage component B (210b) (step ST58). Once merge processing has been started, the merge component 211 executes it asynchronously with the deletion processing component 204.
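The branch at step ST53 decides whether a deletion can be applied directly to a small registration index or must be deferred through the deletion index. A sketch under assumed data structures follows (dictionaries of token to (document identifier, position) lists; the function name and parameters are hypothetical, and tokens are passed in rather than re-segmented from the stored document).

```python
def delete_document(doc_id, tokens, small_reg_indexes, active_del_index, documents):
    """Route a deletion as in steps ST53 to ST55 and ST62; return True if handled directly."""
    # cf. ST53: is the document still held in a small registration index?
    in_small = any(
        d == doc_id
        for idx in small_reg_indexes
        for postings in idx.values()
        for d, _ in postings
    )
    if in_small:
        # cf. ST55: remove its occurrences from the small registration indexes.
        for idx in small_reg_indexes:
            for tok in list(idx):
                idx[tok] = [p for p in idx[tok] if p[0] != doc_id]
                if not idx[tok]:
                    del idx[tok]
    else:
        # cf. ST54: already merged into the large index, so record a deferred deletion.
        for pos, tok in enumerate(tokens):
            active_del_index.setdefault(tok, []).append((doc_id, pos))
    documents.pop(doc_id, None)      # cf. ST62: remove the stored document data
    return in_small


reg_a = {"ab": [(1, 0)], "bc": [(1, 1), (2, 0)]}
reg_b = {}
del_a = {}
docs = {1: "abc", 2: "bc", 3: "ab"}
first = delete_document(1, ["ab", "bc"], [reg_a, reg_b], del_a, docs)
second = delete_document(3, ["ab"], [reg_a, reg_b], del_a, docs)
```

In this example document 1 is removed directly from registration index A, while document 3, assumed to be already merged, is only recorded in the deletion index.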
"Search processing"
In search processing, the user first inputs a search string through the input component 201. The search processing component 205 then uses the text segmentation component 206 to obtain tokens from the search string (step ST34 in Figure 24). Using the large full-text search index in the large full-text search index storage component 208, the search processing component 205 obtains the set (Rs) of document identifiers of document data containing the search string (step ST35). Using small full-text registration index A in small full-text registration index storage component A (209a), it obtains the set (RiA) of document identifiers of document data containing the search string, and using small full-text registration index B in small full-text registration index storage component B (209b), it obtains the corresponding set (RiB). Further, using small full-text deletion index A in small full-text deletion index storage component A (210a), it obtains the set (RdA) of document identifiers of document data containing the search string, and using small full-text deletion index B in small full-text deletion index storage component B (210b), it obtains the corresponding set (RdB) (step ST37).
The search processing component 205 applies the following set operation to the obtained sets of document identifiers (Rs, RiA, RiB, RdA, RdB) to obtain the search result (R) (step ST38), and outputs the set of document identifiers of document data containing the search string to the user through the output component 202 as the search result (R) (step ST39).
R = Rs + RiA + RiB - RdA - RdB,
where + denotes the logical OR operator and - denotes the logical NOT operator, that is, the identifiers in the set following - are excluded from the result.
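Treating each group of document identifiers as a set, the operation above is a union followed by a difference. A minimal sketch, with the function name `combine` chosen only for illustration:

```python
def combine(rs, ri_a, ri_b, rd_a, rd_b):
    """R = Rs + RiA + RiB - RdA - RdB, with + as union and - as exclusion."""
    return (rs | ri_a | ri_b) - (rd_a | rd_b)


result = combine({1, 2, 3}, {4}, {5}, {2}, {3})
```

Here documents 2 and 3 match in the large index but are pending deletion, so they are excluded, while documents 4 and 5 are found only in the small registration indexes and are included.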
"Merge processing"
In merge processing of a small full-text registration index, for every token in the small full-text registration index for which merge processing was started (in this example, small full-text registration index A), (a) the inverted list of the token is extracted from that full-text index (step ST47 in Figure 25), and (b) the extracted inverted list is appended to the end of the inverted list of the corresponding token in the large full-text search index (step ST48). Small full-text registration index A is then emptied (step ST49).
In merge processing of a small full-text deletion index, for every token in the small full-text deletion index for which merge processing was started (in this example, small full-text deletion index A), (c) the inverted list of the token is extracted from that full-text index (step ST59 in Figure 26), and (d) the occurrence position information contained in the extracted inverted list is deleted from the inverted list of the corresponding token in the large full-text search index (step ST60). Small full-text deletion index A is then emptied (step ST61).
The merge processing of inverted lists in this embodiment is the same as described above with reference to Figure 18.
<Embodiment 10>
Next, with reference to Figures 27 to 30, the full-text search device according to the tenth embodiment of the present invention is described, in which three or more small full-text registration index storage components and/or three or more small full-text deletion index storage components are used in the full-text search device of the ninth embodiment.
Figure 27 is a block diagram illustrating the full-text search device according to the tenth embodiment of the present invention.
In the full-text search device according to the tenth embodiment, three or more small full-text indexes for registration and three or more small full-text indexes for deletion are provided (three of each in this example), so that while merging (data transfer) from two small full-text indexes into the large full-text index is in progress, another small full-text index is used for registration processing or deletion processing, which eliminates time during which processing cannot be executed. That is, multiple small full-text indexes for registration are provided so that registration processing can be carried out using another small full-text index for registration even while merge processing, or even other registration processing, is being executed; multiple small full-text indexes for deletion are likewise provided so that deletion processing can be carried out using another small full-text index for deletion even while merge processing or other deletion processing is being executed. In practice, because registration or deletion completes in a shorter time than merge processing, it is assumed that merge processes more often run concurrently.
In the tenth embodiment, the small full-text registration index storage component 209 described with reference to Figure 19 comprises three storage components, namely small full-text registration index storage component A (209a), small full-text registration index storage component B (209b) and small full-text registration index storage component C (209c). The small full-text deletion index storage component 210 described with reference to Figure 19 comprises three storage components, namely small full-text deletion index storage component A (210a), small full-text deletion index storage component B (210b) and small full-text deletion index storage component C (210c). The hardware configurations shown in Figures 15 and 16 also apply to the full-text search device according to the tenth embodiment. One or more of the above storage components may also be provided in memory rather than in storage device 125 or 153.
Next, an example of the operation of the full-text search device according to the tenth embodiment is described in detail.
Figures 28 to 30 are flowcharts illustrating an example of the processing in the full-text search device shown in Figure 27.
When the full-text search device receives a processing request from a user (step ST71), it first judges whether the request is for registration processing (step ST72), for deletion processing (step ST73), or for search processing (No in step ST73). The device then executes one of the following processes according to this judgment.
"Registration processing"
In registration processing, the user first creates document data and registers it through the input component 201. The registration processing component 203 stores the document data in the document data storage component 207 and, at the same time, determines an identifier (document identifier) that designates the document data (step ST81 in Figure 29). Next, using the text segmentation component 206, the registration processing component 203 obtains partial character strings (tokens) and the occurrence position information of each token in the document data (step ST82). The segmentation method and the full-text index are the same as described above. The registration processing component 203 records the document identifier and the occurrence position information of the tokens in the small full-text registration index storage component in use at that time (for example, small full-text registration index storage component A (209a)) (step ST83).
After the recording in step ST83, merge processing is performed as appropriate. In this example, merge processing is executed according to a predetermined merge start condition. First, following the recording in step ST83, the registration processing component 203 judges whether the merge start condition is satisfied (step ST84). When it is not satisfied (No in step ST84), the processing ends. The start conditions of the merge processing modes in the preceding embodiment also apply to the tenth embodiment. When the merge start condition is satisfied (Yes in step ST84), the registration processing component 203 judges whether another small full-text registration index storage component (in this example, small full-text registration index storage component B (209b)) is undergoing merge processing (step ST85). When small full-text registration index storage component B (209b) is undergoing merge processing (Yes in step ST85), the registration processing component 203 judges whether the third small full-text registration index storage component (in this example, small full-text registration index storage component C (209c)) is undergoing merge processing (step ST86). In steps ST85 and ST86, the registration processing component 203 also judges whether each of small full-text registration index storage components B (209b) and C (209c) is being used for registration processing. When small full-text registration index storage component C (209c) is undergoing merge processing (Yes in step ST86), the registration processing component 203 waits for the merge processing to finish. The explanation here mainly assumes the case in which the judgment concerns merge processing.
When the merge start condition is satisfied and either of the other two small full-text registration index storage components B (209b) and C (209c) is not undergoing merge processing (No in step ST85/ST86), the registration processing component 203 starts, for small full-text registration index A in small full-text registration index storage component A (209a), the same merge processing as steps ST47 to ST49 in Figure 25 (steps ST89 to ST91), and switches the storage component used for recording in the next registration processing from small full-text registration index storage component A (209a) to the other small full-text registration index storage component B (209b)/C (209c) (that is, the storage component not undergoing merge processing; the same notation is used below) (step ST87/ST88). Once merge processing has been started, the merge component 211 executes it asynchronously with the registration processing component 203.
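The checks in steps ST85 and ST86 amount to looking for a storage component, other than the one just filled, that is not currently being merged. A small sketch follows; the function name, the string labels and the set of components under merge are assumptions for illustration.

```python
def next_free_component(active, merging, order=("A", "B", "C")):
    """Return a component other than `active` that is not merging, or None to wait."""
    start = order.index(active)
    for step in range(1, len(order)):
        candidate = order[(start + step) % len(order)]
        if candidate not in merging:          # cf. the checks in ST85 and ST86
            return candidate
    return None                               # every other component is busy


choice_when_idle = next_free_component("A", set())
choice_when_b_merging = next_free_component("A", {"B"})
choice_when_all_busy = next_free_component("A", {"B", "C"})
```

A return value of `None` corresponds to the case in which the registration processing component has to wait for a merge to finish.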
"Deletion processing"
In deletion processing, the user first inputs, through the input component 201, the document identifier of the document to be deleted. The deletion processing component 204 then reads the document data from the document data storage component 207 according to the document identifier (step ST101 in Figure 30). Next, the deletion processing component 204 uses the text segmentation component 206 to obtain partial character strings (tokens) and the occurrence position information of the tokens from the document data (step ST102).
The deletion processing component 204 then judges whether the document identifier is registered in a small full-text registration index (step ST103). When the document identifier is registered in a small full-text registration index, the occurrence position information of the tokens is deleted from the small full-text registration index storage component 209 (209a, 209b and 209c) (step ST105). When the document identifier is not registered in a small full-text registration index (that is, when it is registered in the large full-text search index), the document identifier and the occurrence position information of the tokens are recorded in the small full-text deletion index storage component in use at that time (for example, small full-text deletion index storage component A (210a)) (step ST104). The deletion processing component 204 subsequently deletes the document data corresponding to the document identifier from the document data storage component 207 (step ST114).
After the recording in step ST104, merge processing is performed as appropriate. In this example, merge processing is executed according to a predetermined merge start condition. First, following the recording in step ST104, the deletion processing component 204 judges whether the merge start condition is satisfied (step ST106). When it is not satisfied (No in step ST106), the processing ends after step ST114 is executed. As noted above, the start conditions of the merge processing modes of the preceding embodiment also apply to the tenth embodiment. When the merge start condition is satisfied (Yes in step ST106), the deletion processing component 204 judges whether another small full-text deletion index storage component (in this example, small full-text deletion index storage component B (210b)) is undergoing merge processing (step ST107). When small full-text deletion index storage component B (210b) is undergoing merge processing (Yes in step ST107), the deletion processing component 204 judges whether the third small full-text deletion index storage component (in this example, small full-text deletion index storage component C (210c)) is undergoing merge processing (step ST108). In steps ST107 and ST108, the deletion processing component 204 also judges whether each of small full-text deletion index storage components B (210b) and C (210c) is being used for deletion processing. When small full-text deletion index storage component C (210c) is undergoing merge processing (Yes in step ST108), the deletion processing component 204 waits for the merge processing to finish. The explanation here mainly assumes the case in which the judgment concerns merge processing.
When the merge start condition is satisfied and either of the other two small full-text deletion index storage components B (210b) and C (210c) is not undergoing merge processing (No in step ST107/ST108), the deletion processing component 204 starts, for small full-text deletion index A in small full-text deletion index storage component A (210a), the same merge processing as steps ST59 to ST61 shown in Figure 26 (steps ST111 to ST113), and switches the storage component used for recording in the next deletion processing from small full-text deletion index storage component A (210a) to the other small full-text deletion index storage component B (210b)/C (210c) (step ST109/ST110). Once merge processing has been started, the merge component 211 executes it asynchronously with the deletion processing component 204.
"Search processing"
Search processing in the tenth embodiment is basically the same as the search processing described above with reference to Figure 24, and steps ST34 to ST39 in Figure 24 correspond to steps ST74 to ST79 in Figure 28, respectively. The difference is that in step ST76 the search processing component 205 obtains not only the sets (RiA, RiB) but also, using small full-text registration index C in small full-text registration index storage component C (209c), the set (RiC) of document identifiers of document data containing the search string (step ST76). Further, in step ST77 the search processing component 205 obtains not only the sets (RdA, RdB) but also, using small full-text deletion index C in small full-text deletion index storage component C (210c), the set (RdC) of document identifiers of document data containing the search string (step ST77). The search processing component 205 applies the following set operation to the obtained sets of document identifiers (Rs, RiA, RiB, RiC, RdA, RdB, RdC) to obtain the search result (R) (step ST78), and outputs the set of document identifiers of document data containing the search string to the user through the output component 202 as the search result (R) (step ST79).
R = Rs + RiA + RiB + RiC - RdA - RdB - RdC,
where + denotes the logical OR operator and - denotes the logical NOT operator.
<Embodiment 11>
In the ninth and tenth embodiments described above, the full-text search device uses multiple small full-text registration index storage components and/or multiple small full-text deletion index storage components. Next, with reference to Figures 31 to 34, the full-text search device according to the eleventh embodiment of the present invention is described. It is also applicable to the case in which these full-text index storage components (other than the large full-text search index storage component) are located in separate storage areas of storage device 125 or 153 shown in Figure 15 or Figure 16, or in separate areas of memory, and to the case in which the full-text index storage components (other than the large full-text search index storage component) are separate files stored in storage device 125 or 153 or in memory.
Figure 31 is a block diagram illustrating the full-text search device according to the eleventh embodiment of the present invention.
In the full-text search device according to the eleventh embodiment, one small full-text index for registration and one small full-text index for deletion are prepared in advance. During registration/deletion processing, when no full-text registration/deletion storage component is available to store the full-text registration/deletion index, for example while merging (data transfer) from a small full-text index into the large full-text index is in progress, another small full-text index is newly created to carry out registration processing or deletion processing, which avoids periods during which processing cannot be executed. That is, multiple small full-text indexes for registration are provided as needed, so that registration processing can be carried out even while merge processing of multiple small full-text indexes for registration, or other registration processing, is being executed. Similarly, multiple small full-text indexes for deletion are provided as needed, so that deletion processing can be carried out even while merge processing of multiple small full-text indexes for deletion, or other deletion processing, is being executed. In practice, because registration or deletion completes in a shorter time than merge processing, it is assumed that merge processes more often run concurrently.
The full-text search device of the eleventh embodiment includes a memory unit management component 212, which manages small-sized full-text registration index storage parts other than the small-sized full-text registration index storage part A (209a). The memory unit management component 212 likewise manages, for deletion processing, small-sized full-text deletion index storage parts other than the small-sized full-text deletion index storage part A (210a). When, during registration processing, no full-text registration index storage part is available to store the full-text registration index, the memory unit management component 212 newly creates another small-sized full-text registration index storage part. Further, the memory unit management component 212 deletes full-text registration/deletion index storage parts that are no longer needed (those that will not be used in subsequent processing).
In the eleventh embodiment, the small-sized full-text registration index storage part 209 described with reference to accompanying drawing 19 starts as the single small-sized full-text registration index storage part A (209a); storage parts B (209b), C (209c), D (209d), ... are added as needed (in no particular order) and deleted as needed. Similarly, the small-sized full-text deletion index storage part 210 described with reference to accompanying drawing 19 starts as the single small-sized full-text deletion index storage part A (210a); storage parts B (210b), C (210c), D (210d), ... are added as needed (in no particular order) and deleted as needed.
With small-sized full-text registration index storage parts added and deleted as needed, while one of them is being merged into the large-scale full-text search index storage part 208 (or is in use by other registration processing), the registration processing component 203 uses another full-text registration index storage part to carry out registration processing. Likewise, with small-sized full-text deletion index storage parts added and deleted as needed, while one of them is being merged into the large-scale full-text search index storage part 208 (or is in use by other deletion processing), the deletion processing component 204 uses another full-text deletion index storage part to carry out deletion processing. The hardware configurations shown in accompanying drawings 15 and 16 also apply to the full-text search device according to the eleventh embodiment. One or more of the above storage parts may also be provided in memory rather than in the memory storage 125 or 153.
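The following is a minimal Python sketch of the idea in the preceding paragraph: registration keeps writing to one small index while a full one is merged into the large index in the background. It is an illustration under stated assumptions, not the patented implementation. The class name `Registrar`, the dictionary layout, and the size threshold used as the merge trigger are all hypothetical; the actual merge start conditions are those defined in the earlier embodiments.

```python
import threading


class Registrar:
    """Sketch: write to the active small index, merge a full one in the background."""

    def __init__(self, threshold=3):
        self.large = {}             # stands in for the large-scale full-text search index
        self.active = {}            # small registration index currently written to
        self.threshold = threshold  # assumed merge trigger: number of distinct tokens
        self.merge_thread = None

    def _merge(self, small):
        # Runs asynchronously with respect to registration.
        for token, postings in small.items():
            self.large.setdefault(token, []).extend(postings)

    def register(self, doc_id, tokens):
        for pos, token in enumerate(tokens):
            self.active.setdefault(token, []).append((doc_id, pos))
        if len(self.active) < self.threshold:
            return  # assumed merge trigger not reached
        if self.merge_thread is not None:
            self.merge_thread.join()  # wait for an earlier merge to finish
        # Switch to a fresh small index, then merge the full one in the background.
        full, self.active = self.active, {}
        self.merge_thread = threading.Thread(target=self._merge, args=(full,))
        self.merge_thread.start()
```

Because the full index is detached before the merge thread starts, later calls to `register` never touch the index that is being merged.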
An example of the operation of the full-text search device according to the eleventh embodiment is described in detail below.
Accompanying drawings 32 to 34 are flowcharts illustrating an example of the processing procedure of the full-text search device shown in accompanying drawing 31.
When the full-text search device receives a processing request from a user (step ST121), it first determines whether the request is for registration processing (step ST122), for deletion processing (step ST123), or for search processing (No in step ST123). The full-text search device then carries out one of the following processes according to this determination.
"Registration processing"
In registration processing, the user first creates document data and registers it through the input block 201. The registration processing component 203 stores the document data in the document data memory section 207 and, at the same time, determines an identifier (document identifier) that designates the document data (step ST131 in accompanying drawing 33). Further, using the text segmentation component 206, the registration processing component 203 obtains partial character strings (tokens) and the occurrence position information of each token in the document data (step ST132). The segmentation method and the full-text index are the same as those described above.
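As an illustration of step ST132, the sketch below splits a document into partial character strings and records where each one occurs. The segmentation method itself is defined in the earlier embodiments and is not repeated here, so the overlapping character bigrams used below are only an assumption, and the function names `tokenize` and `index_document` are hypothetical.

```python
def tokenize(text, n=2):
    """Return (token, position) pairs for every n-character substring."""
    return [(text[i:i + n], i) for i in range(len(text) - n + 1)]


def index_document(index, doc_id, text, n=2):
    """Record each token's occurrence as (doc_id, position) in `index`."""
    for token, pos in tokenize(text, n):
        index.setdefault(token, []).append((doc_id, pos))
    return index
```

For example, `tokenize("abcab")` yields the token `"ab"` twice, at positions 0 and 3, so both occurrences end up in the index entry for `"ab"`.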
In response to an instruction from the registration processing component 203, or as needed, the memory unit management component 212 determines whether a small-sized full-text registration index storage part is currently available (step ST133). If none is available (No in step ST133), the memory unit management component 212 newly creates another small-sized full-text registration index storage part (for example, small-sized full-text registration index storage part C) (step ST135). Once a storage part is available (Yes in step ST133, or after step ST135), the registration processing component 203 records the document identifier and the occurrence position information of each token in the small-sized full-text registration index storage part in use at that time (for example, small-sized full-text registration index storage part A (209a)/C) (step ST134/ST136).
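The availability check and on-demand creation in steps ST133 and ST135 can be sketched as a small pool manager. This is a hedged illustration only: `SmallIndex`, `StoragePartManager`, the `busy` flag, and the rule used in `prune` are hypothetical names and choices, not details taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class SmallIndex:
    """Hypothetical small full-text index: token -> list of (doc_id, position)."""
    name: str
    postings: dict = field(default_factory=dict)
    busy: bool = False  # True while being merged or used by another request


class StoragePartManager:
    """Sketch of the role played by the memory unit management component 212."""

    def __init__(self):
        self.parts = [SmallIndex("A")]  # part A exists from the start
        self._next = 1                  # B, C, D, ... are created on demand

    def acquire(self):
        """Return an idle small index, creating a new one if all are busy."""
        for part in self.parts:
            if not part.busy:
                part.busy = True
                return part
        part = SmallIndex(chr(ord("A") + self._next), busy=True)
        self._next += 1
        self.parts.append(part)
        return part

    def release(self, part):
        """Mark a small index as idle again."""
        part.busy = False

    def prune(self):
        """Drop idle, empty parts, always keeping at least one."""
        keep = [p for p in self.parts if p.busy or p.postings]
        self.parts = keep or self.parts[:1]
```

A caller would `acquire` a part before recording, `release` it afterwards, and call `prune` once merged parts are no longer needed.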
The recording in step ST134/ST136 is followed by merge processing. In the present example, merge processing is carried out according to predetermined merge start conditions. First, based on the result of the recording in step ST134/ST136, the registration processing component 203 determines whether the merge start conditions are satisfied (step ST137). If they are not satisfied (No in step ST137), the processing ends. The merge start conditions described in the preceding embodiments also apply to the eleventh embodiment. If the merge start conditions are satisfied (Yes in step ST137), the registration processing component 203 determines whether another small-sized full-text registration index storage part (in this example, small-sized full-text registration index storage part B (209b)/A (209a)) is undergoing merge processing (step ST138). If it is (Yes in step ST138), the registration processing component 203 waits for that merge processing to finish. The determination is explained here for merge processing, which is assumed to be the most common case.
If the merge start conditions are satisfied and the other small-sized full-text registration index storage part B (209b)/A (209a) is not undergoing merge processing (No in step ST138), the registration processing component 203 starts merge processing, similar to steps ST47 to ST49 shown in accompanying drawing 25, on the small-sized full-text registration index A/C held in the small-sized full-text registration index storage part A (209a)/C (steps ST140 to ST142). It also switches the storage part in which the next registration processing will record, from the small-sized full-text registration index storage part A (209a)/C to the other small-sized full-text registration index storage part B (209b)/A (209a) (step ST139). Once merge processing has been started, the merging component 211 carries it out asynchronously with the registration processing component 203. The memory unit management component 212 may delete full-text registration index storage parts that are no longer needed (those that will not be used in subsequent processing), either in response to the merge processing or as needed.
"Deletion processing"
In deletion processing, the user first enters, through the input block 201, the document identifier of the document to be deleted. The deletion processing component 204 then reads the document data corresponding to that document identifier from the document data memory section 207 (step ST151 in accompanying drawing 34). Further, the deletion processing component 204 uses the text segmentation component 206 to obtain, from the document data, partial character strings (tokens) and the occurrence position information of each token (step ST152).
Next, the deletion processing component 204 determines whether the document identifier is registered in a small-sized full-text registration index (step ST153). If it is, the occurrence position information of the tokens is deleted from the small-sized full-text registration index storage part (step ST155). If it is not (that is, if the document identifier has already been registered in the large-scale full-text search index), the following recording into a small-sized full-text deletion index storage part is carried out.
In response to an instruction from the deletion processing component 204, or as needed, the memory unit management component 212 determines whether a small-sized full-text deletion index storage part is currently available (step ST154). If none is available (No in step ST154), the memory unit management component 212 newly creates another small-sized full-text deletion index storage part (for example, small-sized full-text deletion index storage part C) (step ST157). Once a storage part is available (Yes in step ST154, or after step ST157), the deletion processing component 204 records the document identifier and the occurrence position information of each token in the small-sized full-text deletion index storage part in use at that time (for example, small-sized full-text deletion index storage part A (210a)/C) (step ST156/ST158). The deletion processing component 204 then deletes the document data corresponding to the document identifier from the document data memory section 207 (step ST175).
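The branch taken in steps ST153 to ST158 can be sketched as follows. This is a simplified illustration with hypothetical names (`delete_document` and its parameters): a document still held only in a small registration index has its postings removed directly, while a document already merged into the large index is recorded in a small deletion index instead. Removing the document data in both branches is an assumption; the source states it explicitly only after the recording in step ST156/ST158.

```python
def delete_document(doc_id, tokens, small_reg, small_del, documents):
    """Route a deletion as in steps ST153 to ST158, then drop the document data."""
    registered = any(
        d == doc_id for postings in small_reg.values() for d, _ in postings
    )
    if registered:
        # Still only in the small registration index: remove its postings directly.
        for token in list(small_reg):
            small_reg[token] = [(d, p) for d, p in small_reg[token] if d != doc_id]
            if not small_reg[token]:
                del small_reg[token]
    else:
        # Already merged into the large index: record it in the deletion index.
        for token, pos in tokens:
            small_del.setdefault(token, []).append((doc_id, pos))
    documents.pop(doc_id, None)  # assumed to correspond to step ST175
```

Here `tokens` is the list of (token, position) pairs obtained from the document in step ST152, and the indexes are plain dictionaries mapping a token to a list of (document identifier, position) pairs.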
The recording in step ST156/ST158 is followed by merge processing. In the present example, merge processing is carried out according to predetermined merge start conditions. First, based on the result of the recording in step ST156/ST158, the deletion processing component 204 determines whether the merge start conditions are satisfied (step ST159). If they are not satisfied (No in step ST159), the processing ends after step ST175 has been executed. As noted above, the merge start conditions described in the preceding embodiments also apply to the eleventh embodiment. If the merge start conditions are satisfied (Yes in step ST159), the deletion processing component 204 determines whether another small-sized full-text deletion index storage part (in this example, small-sized full-text deletion index storage part B (210b)/A (210a)) is undergoing merge processing (step ST170). In step ST170, the deletion processing component 204 also determines whether the small-sized full-text deletion index storage part B (210b)/A (210a) is undergoing deletion processing. If that storage part is in use (Yes in step ST170), the deletion processing component 204 waits until the merge processing finishes. The determination is explained here for merge processing, which is assumed to be the most common case.
If the merge start conditions are satisfied and the other small-sized full-text deletion index storage part B (210b)/A (210a) is not undergoing merge processing (No in step ST170), the deletion processing component 204 starts merge processing, similar to steps ST59 to ST61 shown in accompanying drawing 26, on the small-sized full-text deletion index A/C held in the small-sized full-text deletion index storage part A (210a)/C (steps ST172 to ST174). It also switches the storage part in which the next deletion processing will record, from the small-sized full-text deletion index storage part A (210a)/C to the other small-sized full-text deletion index storage part B (210b)/A (210a) (step ST171). Once merge processing has been started, the merging component 211 carries it out asynchronously with the deletion processing component 204. The memory unit management component 212 may delete full-text deletion index storage parts that are no longer needed (those that will not be used in subsequent processing), either during merge processing or as needed.
"Search processing"
Search processing in the eleventh embodiment is basically similar to the search processing described above with reference to accompanying drawing 24, and steps ST34 to ST39 in accompanying drawing 24 correspond to steps ST124 to ST129 in accompanying drawing 32, respectively. The differences are as follows. In step ST126, the search processing component 205 uses the small-sized full-text registration indexes in all small-sized full-text registration index storage parts that currently exist to obtain a set (Ri) of document identifiers of document data containing the search string. In step ST127, the search processing component 205 uses the small-sized full-text deletion indexes in all small-sized full-text deletion index storage parts that currently exist to obtain a set (Rd) of document identifiers of document data containing the search string. The search processing component 205 applies the following set operation to the obtained sets of document identifiers (Rs, Ri, Rd) to obtain the search result (R) (step ST128), and outputs the set of document identifiers of document data containing the search string to the user as the search result (R) through the output block 202 (step ST129).
R = Rs + Ri - Rd,
where + denotes the logical OR operator and - denotes the logical NOT operator; that is, R contains the document identifiers found in Rs or Ri, excluding those found in Rd.
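The set operation above maps directly onto Python sets. The sketch below is illustrative only: it assumes that Rs is the set obtained from the large-scale index as in the earlier embodiments, that each index is a dictionary from token to a list of (document identifier, position) pairs, and that the query is a single token. The function names `combine` and `search_all` are hypothetical.

```python
def combine(rs, ri, rd):
    """R = Rs + Ri - Rd: union of the hits, minus documents marked as deleted."""
    return (set(rs) | set(ri)) - set(rd)


def search_all(query, large, small_regs, small_dels):
    """Look up `query` in the large index and in every small index that exists."""
    rs = {doc for doc, _ in large.get(query, [])}
    ri = {doc for idx in small_regs for doc, _ in idx.get(query, [])}
    rd = {doc for idx in small_dels for doc, _ in idx.get(query, [])}
    return combine(rs, ri, rd)
```

Iterating over every small registration and deletion index reflects the point made for steps ST126 and ST127: a search must consult all small indexes that currently exist, not only the one in active use.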
The present invention applies not only to the full-text search device described above, but also to a full-text search method in a full-text search system, as illustrated by the flowcharts above that show example processing of the full-text search device. The present invention further applies to a program for realizing the functions of the full-text search device, a program for realizing the functions of the components that make up the full-text search device, a program for carrying out the full-text search method, a program for carrying out the described processing steps, and a computer-readable recording medium holding any of these programs.
<Embodiment 12>
This embodiment describes a recording medium that stores the programs and data for realizing the full-text search functions of the preceding embodiments. Specifically, the recording medium may be a CD-ROM, a magneto-optical disk, a DVD-ROM, a floppy disk (FD), a flash memory, various types of ROM and RAM, and so on. Recording the program that causes a computer to carry out the functions of the full-text search device of each preceding embodiment on a distributable recording medium makes those functions easier to realize. The functions of the full-text search device according to the present invention can be realized by loading the recording medium into an information processing apparatus such as a computer and having the apparatus read the program, or by storing the program in a storage medium of the information processing apparatus and reading it when needed.
The present invention is not limited to the specifically disclosed embodiments; variations and modifications may be made without departing from the scope of the present invention.
The present application is based on Japanese priority applications No. 2002-165580 filed on June 6, 2002, No. 2002-169487 filed on June 11, 2002, and No. 2002-214343 filed on July 23, 2002, the entire contents of which are incorporated herein by reference.

Claims (5)

1. A database management system for managing a database, the system comprising:
a first data storing component for searching, which executes data search operations at high speed and data alteration operations at low speed;
second data storing components for insertion and deletion, each of which executes data search operations at low speed and data alteration operations at high speed;
a data transmission component, which transfers data from each of the second data storing components to the first data storing component so as to reflect the results of insertion or deletion operations;
a database operation request processing component, which executes operation requests to the database;
a transaction processing component, which ensures the consistency of data between the data transmission component and the database operation request processing component; and
a file conversion component, which switches the second data storing components between operation requests to the database and asynchronous merge processing into the first data storing component, so that while one of the second data storing components is used for asynchronous merge processing, another second data storing component is used for operation requests to the database.
2. The database management system according to claim 1, comprising three or more second data storing components; and
a data volume judging component, which judges whether the amount of data in the second data storing component used for operation requests to the database has exceeded a boundary value,
wherein the file conversion component switches the second data storing component that the data volume judging component has judged to exceed the boundary value over to merge processing, and switches another, unused second data storing component over to serve as the second data storing component used for operation requests to the database.
3. The database management system according to claim 1, comprising a data volume judging component, which judges whether the amount of data in the second data storing component used for operation requests to the database has exceeded a boundary value,
wherein the file conversion component switches the second data storing component that the data volume judging component has judged to exceed the boundary value over to merge processing, and dynamically creates a new second data storing component, so that the newly created second data storing component becomes the second data storing component used for operation requests to the database.
4. The database management system according to claim 1, comprising a watchdog timer, which monitors the elapse of a set time and requests release of a lock when the set time has elapsed,
wherein the database operation request processing component starts the watchdog timer upon receiving a search request,
the data transmission component carries out the merge processing of a plurality of index units collectively, without releasing the lock, until the set time has elapsed, and
when the set time has elapsed, the data transmission component releases the lock held for the merge processing and passes control to the search request.
5. The database management system according to claim 4, wherein the set time set in the watchdog timer is the maximum time by which a search can be deferred.
CNB2005101268494A 2002-06-06 2003-06-06 Database management system Expired - Fee Related CN100495394C (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP165580/02 2002-06-06
JP2002165580A JP4289834B2 (en) 2002-06-06 2002-06-06 Database management system, database management program, and recording medium
JP169487/02 2002-06-11
JP214343/02 2002-07-23

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CNB031330142A Division CN1297933C (en) 2002-06-06 2003-06-06 Full-lext search device capable of excecuting merge treatment and logon/deletion treatment

Publications (2)

Publication Number Publication Date
CN1770162A true CN1770162A (en) 2006-05-10
CN100495394C CN100495394C (en) 2009-06-03

Family

ID=30433384

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005101268494A Expired - Fee Related CN100495394C (en) 2002-06-06 2003-06-06 Database management system

Country Status (2)

Country Link
JP (1) JP4289834B2 (en)
CN (1) CN100495394C (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678690A (en) * 2013-12-26 2014-03-26 Tcl集团股份有限公司 Transaction management method and device for Android system
CN107667364A (en) * 2015-06-04 2018-02-06 微软技术许可有限责任公司 Use the atomic update of hardware transaction memory control index

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787119B (en) * 2016-03-25 2020-06-16 盛趣信息技术(上海)有限公司 Big data processing method and system based on hybrid engine

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678690A (en) * 2013-12-26 2014-03-26 Tcl集团股份有限公司 Transaction management method and device for Android system
CN103678690B (en) * 2013-12-26 2018-01-12 Tcl集团股份有限公司 A kind of Andriod system transactions management method and device
CN107667364A (en) * 2015-06-04 2018-02-06 微软技术许可有限责任公司 Use the atomic update of hardware transaction memory control index
CN107667364B (en) * 2015-06-04 2022-04-26 微软技术许可有限责任公司 Atomic update using hardware transactional memory control indexes

Also Published As

Publication number Publication date
CN100495394C (en) 2009-06-03
JP2004013490A (en) 2004-01-15
JP4289834B2 (en) 2009-07-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090603

Termination date: 20180606
