CN101059811A - Document retrieving system, document retrieving apparatus, and method thereof - Google Patents

Info

Publication number
CN101059811A
Authority
CN
China
Prior art keywords
document
search condition
index
unit
retrieval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200710088524
Other languages
Chinese (zh)
Other versions
CN100498791C (en)
Inventor
佐藤正晃
福田慎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Publication of CN101059811A
Application granted
Publication of CN100498791C
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Each of two or more document management servers stores a document and index data corresponding to the document. Of the index data stored in the two or more document management servers, the index data corresponding to documents that satisfy a first retrieval condition is collected and stored. When a user requests retrieval based on a second retrieval condition, it is determined whether the first and second retrieval conditions are the same; when they are the same, the document retrieval requested by the user is performed by referring to the collected index data.

Description

Document retrieving system, document retrieving apparatus, and method thereof
Technical Field
The present invention relates to a document retrieving system, a document retrieving apparatus, and a method thereof for retrieving a document from among documents registered in two or more document management servers connected by a network.
Background Art
A document retrieving system is known as a system in which two or more documents are stored in a database such as an archive server, and a user retrieves a desired document from among the stored documents. In such a system, for example, when a document is registered, the keywords contained in the document data are extracted to produce an index. The index produced in this way is managed separately from the document, but in association with it.
When the user enters a keyword for retrieving a document, it is determined whether the entered keyword is contained in an index. If an index contains the keyword, the document corresponding to that index is determined to be a document to be retrieved. Using an index in this way improves the response time of retrieval.
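The index-based lookup described above can be sketched as follows. This is a minimal illustration only: the function names, the sample documents, and the whitespace-based keyword extraction are assumptions, since the text does not say how keywords are extracted.

```python
from collections import defaultdict


def build_index(documents):
    """Map each keyword to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumption: keywords are simply the whitespace-separated words.
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, keyword):
    """Return the sorted IDs of documents whose index contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))


documents = {
    "doc1": "project plan for camera firmware",
    "doc2": "tokyo office camera inventory",
    "doc3": "meeting notes",
}
index = build_index(documents)
print(search(index, "camera"))
```

The search touches only the index, never the document bodies, which is why the response time improves.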
As such a document retrieving system, an integrated document retrieval service has been proposed in which a device operated by the user (for example, a personal computer (PC)) requests two or more servers to perform retrieval (see Japanese Patent Laid-Open No. 2004-342042). In this system, the two or more servers share the indexes of the documents stored in the respective servers. Therefore, by sending a document retrieval command to any one of the servers, the user can retrieve from among the documents stored in the two or more servers.
However, the conventional techniques described above have the following problems. When a document is retrieved from among documents stored in two or more servers connected by a network or the like, if a retrieval request is made to all the servers, it may take time before retrieval results are received from all of them. In addition, because a large amount of data containing retrieval requests or retrieval results is sent over the network for every retrieval operation, the network is placed under heavy load.
Alternatively, if the two or more servers share the indexes of the documents stored in the respective servers, as disclosed in the above-mentioned Japanese Patent Laid-Open No. 2004-342042, it suffices to make a retrieval request to just one of the servers at retrieval time, which improves retrieval efficiency.
In this case, however, each of the two or more servers is required to maintain the indexes of the documents registered in all the other servers. Therefore, as the number of servers or stored documents increases, the amount of index data to be maintained increases, which wastes memory resources and lengthens the time required for retrieval. In addition, whenever a document is registered in a server, its index is sent to all the other servers over the network, which increases network traffic.
Summary of the Invention
An object of the present invention is to solve the above-described problems of the conventional techniques.
A feature of the present invention is to improve retrieval efficiency when a document is retrieved from among documents registered in two or more document management servers connected by a network.
According to an aspect of the present invention, there is provided a document retrieving system for retrieving a document from among documents registered in two or more document management servers connected by a network, the document retrieving system comprising:
a storage unit, provided in each of the document management servers, configured to store documents and index data corresponding to the documents;
a collecting unit configured to collect, from the index data stored in the storage unit of each of the document management servers, index data corresponding to documents that satisfy a first search condition;
a determining unit configured to determine whether a second search condition, specified by a user, is identical to the first search condition; and
a retrieval unit configured to retrieve a document by referring to the index data collected by the collecting unit in a case where the determining unit determines that the second search condition is identical to the first search condition.
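How the collecting, determining, and retrieval units might cooperate can be sketched as below. The class and function names are hypothetical, and the fallback of querying every server when the two conditions differ is an assumption; this aspect only states what happens when the conditions are identical.

```python
class DocumentServer:
    """Hypothetical stand-in for one document management server."""

    def __init__(self, name, index_data):
        self.name = name
        # index_data maps a document ID to the set of keywords in its index.
        self.index_data = index_data

    def search(self, condition):
        return [d for d, kws in self.index_data.items() if condition in kws]


def collect(servers, first_condition):
    """Collecting unit: gather index entries of documents matching the condition."""
    collected = {}
    for server in servers:
        for doc_id in server.search(first_condition):
            collected[doc_id] = server.index_data[doc_id]
    return collected


def retrieve(servers, first_condition, collected, second_condition):
    """Determining and retrieval units; returns (document IDs, servers queried)."""
    if second_condition == first_condition:
        # Same condition: answer from the collected index data only.
        return sorted(collected), 0
    # Assumed fallback (not stated in this aspect): ask every server.
    hits = [d for s in servers for d in s.search(second_condition)]
    return sorted(hits), len(servers)


servers = [
    DocumentServer("A", {"a1": {"camera", "tokyo"}, "a2": {"budget"}}),
    DocumentServer("B", {"b1": {"camera"}}),
]
collected = collect(servers, "camera")
print(retrieve(servers, "camera", collected, "camera"))
```

The second element of the returned tuple makes the benefit visible: a repeated condition is answered without contacting any individual server.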
According to an aspect of the present invention, there is provided a document retrieving apparatus for retrieving a document from among documents registered in two or more document management servers connected to the document retrieving apparatus by a network, the document retrieving apparatus comprising:
an input unit configured to receive a search condition for retrieving a document;
a log information management unit configured to manage log information of the search conditions received by the input unit;
a determining unit configured to determine, based on the log information managed by the log information management unit, whether there is a search condition that satisfies a predetermined collection condition; and
a collecting unit configured to collect, in the document retrieving apparatus, index data corresponding to documents that satisfy the search condition in a case where the determining unit determines that there is a search condition satisfying the predetermined collection condition.
In addition, according to an aspect of the present invention, there is provided a document retrieving method for retrieving a document from among documents registered in two or more document management servers connected by a network, the method comprising the steps of:
storing, in a storage unit of each of the document management servers, documents and index data corresponding to the documents;
collecting, from the index data stored in the storage units in the storing step, index data corresponding to documents that satisfy a first search condition;
determining whether a second search condition, specified by a user, is identical to the first search condition; and
performing retrieval by referring to the index data collected in the collecting step in a case where it is determined in the determining step that the second search condition is identical to the first search condition.
According to an aspect of the present invention, there is provided a document retrieving method for a document retrieving apparatus that retrieves a document from among documents registered in two or more document management servers connected to the document retrieving apparatus by a network, the method comprising the steps of:
inputting a search condition for retrieving a document;
managing log information of the search conditions input in the inputting step;
determining, based on the log information managed in the managing step, whether there is a search condition that satisfies a predetermined collection condition; and
collecting, in the document retrieving apparatus, index data corresponding to documents that satisfy the search condition in a case where it is determined in the determining step that there is a search condition satisfying the predetermined collection condition.
Other features of the present invention will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Figs. 1A and 1B are diagrams describing features of the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 2 is a block diagram showing the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 3 is a block diagram showing a specific hardware configuration of a management server and an archive server according to an embodiment of the present invention;
Figs. 4 and 5 are flowcharts describing retrieval processing on the management server according to an embodiment of the present invention;
Fig. 6 is a flowchart describing processing for registering new document data in the document retrieving system according to an embodiment of the present invention;
Fig. 7 is a diagram showing an example keyword input window on a personal computer (PC) according to an embodiment of the present invention;
Fig. 8 is a diagram showing an example hit table stored in a hit data memory according to an embodiment of the present invention;
Figs. 9A and 9B are diagrams showing an example collected index table that stores collected indexes according to an embodiment of the present invention;
Fig. 10 is a diagram showing an example window for displaying document retrieval results on the PC according to an embodiment of the present invention;
Fig. 11 is a diagram showing an example table for managing the keywords for which the corresponding indexes have been collected according to an embodiment of the present invention;
Fig. 12 is a diagram describing features of the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 13 is a block diagram showing the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 14 is a flowchart describing retrieval processing on an archive server according to an embodiment of the present invention;
Fig. 15 is a diagram showing an example retrieval statistics data table according to an embodiment of the present invention;
Fig. 16 is a diagram showing an example index table managed by an archive server according to an embodiment of the present invention;
Fig. 17 is a diagram showing an example search condition sent to an archive server according to an embodiment of the present invention;
Fig. 18 is a diagram showing an example retrieval result display window of an archive server according to an embodiment of the present invention;
Fig. 19 is a flowchart describing retrieval processing on an archive server according to an embodiment of the present invention;
Fig. 20 is a flowchart describing processing for creating a collection condition on an archive server according to an embodiment of the present invention;
Fig. 21 is a diagram showing an example collection condition sent to an archive server according to an embodiment of the present invention;
Fig. 22 is a flowchart describing processing for collecting indexes on an archive server according to an embodiment of the present invention;
Fig. 23 is a flowchart describing processing for collecting indexes on an archive server according to an embodiment of the present invention;
Fig. 24 is a flowchart describing processing for registering a new document on an archive server according to an embodiment of the present invention;
Figs. 25A and 25B are diagrams describing document registration processing on an archive server according to an embodiment of the present invention;
Fig. 26 is a diagram describing features of the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 27 is a block diagram showing the configuration of a document retrieving system according to an embodiment of the present invention;
Fig. 28 is a flowchart describing processing for collecting indexes on an archive server according to an embodiment of the present invention; and
Figs. 29A and 29B are diagrams describing document attributes and capability information of an archive server according to an embodiment of the present invention.
Description of the Embodiments
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that these embodiments do not limit the scope of the invention, and that not all combinations of the features described in the embodiments are necessarily essential to the invention.
[Embodiment 1]
Figs. 1A and 1B are diagrams describing features of the configuration of the document retrieving system according to the first embodiment of the present invention.
Fig. 1A is a block diagram describing the configuration of a conventional document retrieving system (multi-server retrieval system). Here, the user operates a PC to send a search condition (keyword) to each server in order to make a retrieval request. Each server stores documents and the indexes associated with them. When a server receives a retrieval request from the PC, it compares the search condition received from the PC with the indexes stored in it and notifies the PC of the retrieval result. The PC merges the retrieval results reported by the servers and displays the merged result to the user.
If the PC makes retrieval requests to a large number of servers, retrieval may be time-consuming. In particular, when the PC waits to receive retrieval results from all the servers and merges them for display, nothing can be displayed until the results have arrived from every server. This reduces retrieval efficiency.
On the other hand, Fig. 1B is a block diagram describing the configuration of the document retrieving system (multi-server retrieval system) according to the first embodiment. Here, the user first uses the PC 103 to send a search condition (keyword) to the management server 101 and, at the same time, sends the search condition to the servers 105 to 107 to request document retrieval. Based on a predetermined collection condition (for example, that the search condition has been used at least a predetermined number of times within a predetermined period), the management server 101 determines whether the search condition entered by the user is one that should be subject to index collection.
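The example collection condition given above (a search condition used at least a predetermined number of times within a predetermined period) can be sketched as a simple check over a search log. The log format, the function name, and the concrete thresholds are illustrative assumptions.

```python
from datetime import datetime, timedelta


def is_collection_target(search_log, keyword, now, period, min_uses):
    """Return True if `keyword` was used at least `min_uses` times within `period`."""
    cutoff = now - period
    uses = sum(
        1
        for used_keyword, used_at in search_log
        if used_keyword == keyword and used_at >= cutoff
    )
    return uses >= min_uses


now = datetime(2007, 3, 15, 12, 0)
# Hypothetical log of (keyword, time of use) pairs.
search_log = [
    ("camera", datetime(2007, 3, 14, 9, 0)),
    ("camera", datetime(2007, 3, 15, 10, 0)),
    ("camera", datetime(2007, 1, 2, 8, 0)),  # outside a 30-day window
    ("tokyo", datetime(2007, 3, 10, 11, 0)),
]
print(is_collection_target(search_log, "camera", now, timedelta(days=30), 2))
```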
When it is determined that the search condition is subject to index collection, the indexes corresponding to the search condition are collected in a certain server. In this collecting operation, the indexes stored in the servers 105 to 107 may be moved to that server. Alternatively, the indexes may be copied so that they are stored in the collection-destination server while also remaining on the original servers. In addition, the management server 101 manages information indicating the search conditions for which the corresponding indexes have been collected, and information indicating where those indexes have been collected.
Thereafter, when the user specifies a search condition and gives an instruction to perform retrieval, the management server 101 determines whether it is managing the specified search condition. If it is, this means that the corresponding indexes have already been collected. The management server 101 therefore identifies the server in which the indexes have been collected and notifies that server of the search condition, thereby requesting retrieval.
The server in which the indexes are collected may be one of the servers 105 to 107 or the management server 101. The first embodiment describes the case where the indexes are collected in whichever of the servers 105 to 107 stores the largest number of documents corresponding to the collected indexes (in Fig. 1B, one of the servers 105 to 107).
In this case, when a keyword has been specified as a search condition at least a predetermined number of times, the server that stores the largest number of documents retrieved by that keyword is identified. The indexes of the documents retrieved by the keyword are then collected in that server (for example, the server 106). Thereafter, whenever retrieval with that keyword is instructed, the documents stored in the other servers 105 and 107 can also be searched simply by requesting retrieval from the server 106 in which the indexes have been collected. Retrieval efficiency is thereby improved.
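Choosing the collection destination and copying the matching indexes into it might look like the following sketch. The server names mirror the reference numerals in the text, but the data layout and function names are assumptions made for illustration.

```python
def choose_destination(documents_per_server):
    """Pick the server holding the most documents matched by the keyword."""
    return max(documents_per_server, key=documents_per_server.get)


def collect_indexes(server_indexes, keyword):
    """Copy every matching index entry and choose where to store the copies."""
    matches = {
        name: {doc: kws for doc, kws in index.items() if keyword in kws}
        for name, index in server_indexes.items()
    }
    destination = choose_destination({n: len(m) for n, m in matches.items()})
    collected = {}
    for name, entries in matches.items():
        for doc, kws in entries.items():
            # Remember which server actually holds the document body.
            collected[doc] = {"keywords": kws, "stored_on": name}
    return destination, collected


server_indexes = {
    "server105": {"d1": {"camera"}, "d2": {"tokyo"}},
    "server106": {"d3": {"camera"}, "d4": {"camera", "tokyo"}},
    "server107": {"d5": {"budget"}},
}
destination, collected = collect_indexes(server_indexes, "camera")
print(destination, sorted(collected))
```

In this sketch a tie between servers is resolved in favor of the first one listed; the text does not say how ties are handled.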
In this case, it is necessary to manage the keywords for which the corresponding indexes have been collected and the servers in which those indexes have been collected. For this purpose, in the first embodiment, for each keyword specified as a search condition, the management server 101 stores information indicating the server in which the indexes corresponding to that keyword have been collected, for example in the form of the table shown in Fig. 11. In the example of Fig. 11, the indexes for the listed keywords are collected in the servers 105, 106, and 107, respectively.
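A table of this kind can be modeled as a plain keyword-to-server mapping that the management server consults before dispatching a request. The concrete pairings below are hypothetical and do not reproduce Fig. 11, and querying all servers for an unlisted keyword is an assumed default.

```python
# Keyword -> server holding the collected indexes (the role of the Fig. 11 table).
# The pairings are illustrative only.
collection_table = {
    "engineering A": "server105",
    "Tokyo": "server106",
    "camera": "server107",
}
ALL_SERVERS = ["server105", "server106", "server107"]


def servers_to_query(keyword, table=collection_table, all_servers=ALL_SERVERS):
    """Return the servers that must be asked to answer a search for `keyword`."""
    destination = table.get(keyword)
    if destination is not None:
        # Indexes already collected: one request is enough.
        return [destination]
    # Assumed default: no collection yet, so ask every server.
    return list(all_servers)


print(servers_to_query("Tokyo"))
print(servers_to_query("printer"))
```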
Although the management server 101 is illustrated in this embodiment as a server separate from the servers 105 to 107, it may be incorporated in any one of the servers 105 to 107. Alternatively, the management server 101 may be incorporated in the PC 103.
An index is produced for each document on each server and contains information indicating the character strings contained in the document, the document name, the date and time the document was created, the user who created the document, and so on. A keyword specified as a search condition refers either to the index of a document itself or to a character string, contained in the index of a document, that the user enters in order to perform retrieval.
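One possible in-memory shape for such an index is sketched below. The field names and the matching rule (an indexed string, or a substring of the document name) are assumptions; the text lists only the kinds of information an index contains.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DocumentIndex:
    """Hypothetical index record for one document."""

    document_name: str
    created_at: datetime
    created_by: str
    strings: set = field(default_factory=set)  # character strings found in the document

    def matches(self, keyword: str) -> bool:
        """True if the keyword is an indexed string or occurs in the document name."""
        return keyword in self.strings or keyword in self.document_name


record = DocumentIndex(
    document_name="engineering A schedule",
    created_at=datetime(2007, 3, 1, 9, 30),
    created_by="user01",
    strings={"schedule", "milestone"},
)
print(record.matches("milestone"), record.matches("camera"))
```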
Fig. 2 is a block diagram showing the configuration of the document retrieving system according to the first embodiment of the present invention.
This system comprises the management server 101, an archive server (document storage device) 105, and the PC (personal computer) 103, interconnected via the Internet 104. Although two or more archive servers in addition to the archive server 105 (for example, 106 and 107) are connected to this system via the Internet 104, only the archive server 105 is shown here as a representative example. The management server 101 may be incorporated in one of the two or more archive servers.
In this configuration, by operating software called a browser provided on the PC 103, the user can access the management server 101 or the archive server 105 to obtain document data. Note that each archive server stores both the document data itself and the corresponding index.
Although the Internet 104 is used to connect the servers in the first embodiment, the invention is not limited to this. For example, a LAN (local area network) or another network system may be used.
The management server 101 is a server that provides a document registration/retrieval service function integrating the two or more archive servers. For example, by using the browser on the PC 103 to access an address (URL) provided by the management server 101, the user can register a document in any of the archive servers, or view, obtain, update, or retrieve documents stored in the archive servers.
The management server 101 monitors the keywords that users specify as search conditions for retrieving documents. Then, according to the setting data (for example, the collection condition described above) stored in the setting storage unit 110, the management server 101 extracts frequently used keywords and decides whether to collect the indexes corresponding to those keywords. When the indexes corresponding to a keyword are to be collected, the management server 101 collects them from the archive servers and stores them in a certain server (one of the archive servers or the management server 101). In this case, collecting the indexes in the archive server that stores the largest number of documents corresponding to the collected indexes makes the retrieval described below more efficient.
Then, when a keyword entered by the user on the PC 103 is identical to a keyword for which the indexes have been collected, a retrieval request is made to the archive server that stores the collected indexes.
When the user registers a new document after the indexes have been collected, it is determined whether the index produced from that document is to be collected. If it is, the document body (document data) and its index are stored in the archive server that is the collection destination of the index.
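The registration decision just described can be sketched as follows. The function name, the notion of a "requested" default server, and the rule that the first matching keyword wins are all assumptions made for illustration.

```python
def choose_registration_server(document_keywords, collection_table, requested_server):
    """Return the server on which a newly registered document should be stored."""
    for keyword in document_keywords:
        if keyword in collection_table:
            # The document's index is subject to collection, so the body and
            # index go to the collection destination for that keyword.
            return collection_table[keyword]
    # Otherwise the document stays on the server the user asked for.
    return requested_server


# Hypothetical state: indexes for "camera" have been collected on server106.
collection_table = {"camera": "server106"}
print(choose_registration_server(["lens", "camera"], collection_table, "server105"))
print(choose_registration_server(["budget"], collection_table, "server105"))
```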
The archive server 105, on the other hand, stores document bodies and indexes and provides functions such as registering, viewing, obtaining, updating, and retrieving documents. Using the service provided by the archive server 105, the user accesses a predetermined address (typically a URL) via the Internet 104 to access the documents stored in the archive server 105.
When each of the two or more archive servers receives a retrieval request with a keyword specified as the search condition, it refers to the indexes stored in it, retrieves the documents corresponding to the search condition, and sends the retrieval result to the PC 103 or the management server 101.
When the archive server 105 receives a document registration request from the PC 103, the document registration unit 121 provided in it performs the function of registering the document.
Next, the management server 101 of the first embodiment will be described.
The setting storage unit 110 stores various setting data. The setting data includes information used to determine whether the indexes corresponding to a given keyword are to be collected as described above. Specifically, for obtaining the frequency of use of keywords, the setting data defines the number of days over which keywords are monitored and down to what rank in the ranking of most frequently used keywords the keywords are subject to index collection. Alternatively, the setting data defines down to what rank in the ranking of keywords by number of retrieved documents and/or number of retrievals the keywords are subject to index collection. The setting data further specifies whether the frequency of use is to be used in combination with the number of retrieved documents and/or the number of retrievals. In addition, the setting data includes information identifying the archive server 105 in which the collected indexes are stored.
The setting data also defines, for document registration, down to what rank in the ranking of most frequently used keywords the corresponding indexes are subject to collection. In addition, where the index of a document being registered is subject to collection at registration time, the setting data includes information on whether the document body is to be registered on the collection-destination server. Note that the setting data can be set arbitrarily by the service provider that manages the management server 101.
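Part of this setting data, and the rank-based selection it drives, can be sketched as below. Only three of the settings are modeled, and the field names, sample counts, and alphabetical tie-breaking are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SettingData:
    """Hypothetical subset of the setting data described above."""

    monitoring_days: int                 # how many days of keyword use are considered
    top_n_by_use: int                    # how far down the usage ranking to collect
    register_body_at_destination: bool   # store new document bodies at the destination


def keywords_to_collect(use_counts, settings):
    """Return the `top_n_by_use` most frequently used keywords."""
    # Sort by descending use count; break ties alphabetically (an assumption).
    ranked = sorted(use_counts.items(), key=lambda item: (-item[1], item[0]))
    return [keyword for keyword, _ in ranked[: settings.top_n_by_use]]


settings = SettingData(monitoring_days=30, top_n_by_use=2,
                       register_body_at_destination=True)
# Use counts assumed to have been gathered over `monitoring_days`.
use_counts = {"engineering A": 14, "Tokyo": 9, "camera": 11, "budget": 2}
print(keywords_to_collect(use_counts, settings))
```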
Based on the keywords specified as search conditions, the hit data memory 111 stores, as a hit table (see Fig. 8), the number of indexes (that is, the number of hits) sent from the archive servers and the number of times each keyword has been used, in association with the archive servers 105 to 107. Under the conditions of the setting data stored in the setting storage unit 110, the index manager 112 compares the hit data of the keywords stored in the hit data memory 111 to determine the keywords that are subject to index collection. The index manager 112 then obtains the indexes corresponding to each such keyword from the archive servers and stores them, in association with the keyword, in the storage unit 118. In this way, collected indexes (see Figs. 9A and 9B) are produced by collecting (copying) the frequently used indexes. The collected indexes obtained in this way are sent to, and stored in, the archive server that stores the largest number of documents retrieved by the keyword.
The document register 113 registers a document designated by the user. The keyword monitoring unit 115 obtains the keyword specified by the user and information on the number of hits of the indexes sent from the archive server 105, and stores the keyword and this information in the storage unit 118. When a document is registered, the index extractor 116 produces the index of the document. When a document is registered, the registration destination discriminator 114 identifies, from among the two or more archive servers, the archive server in which the index extracted by the index extractor 116 is to be collected. Then, based on the information managed by the setting storage unit 110, the registration destination discriminator 114 determines whether the document body is to be registered in the archive server that is the collection destination of the index. When such registration is required, the collection-destination archive server of the index is selected as the registration destination of the document. The controller 117 controls the overall operation of the management server 101. The storage unit 118 is configured with the RAM (202 in Fig. 3), the HDD (209 in Fig. 3), and the like, and stores the various tables and data described above under the control of the controller 117. The network interface 119 controls communication via the Internet 104 or a LAN.
Next, the archive server 105 will be described.
When the retrieval unit 120 receives a keyword as a search condition from the PC 103, it searches the indexes registered by the document registration unit 121 to extract the indexes that contain the keyword. When registration of a document is instructed from the PC 103, the document registration unit 121 stores the document body and the index extracted from the document in the storage unit 123. The controller 122 controls the overall operation of the archive server 105. The storage unit 123 is configured with the RAM (202 in Fig. 3), the HDD (209 in Fig. 3), and the like, and stores various data such as the tables described above under the control of the controller 122. The network interface 124 controls communication via the Internet 104 or a LAN.
As described below, the management server 101, the archive server 105, and the PC 103 used by the user are each configured as an information processing apparatus (computer) comprising a CPU, ROM, RAM, HDD, and the like. These servers provide their functions through, for example, Web services.
Fig. 3 is a block diagram showing a specific hardware configuration of the management server 101 and the archive server 105 according to this embodiment. The PC 103 described above has the same hardware configuration.
In Fig. 3, the CPU 201 controls the overall processing of the server according to programs stored in the program ROM 203 and the RAM 202. The RAM 202 serves as the main memory of the CPU 201, as an area for programs to be executed, and as an execution area and data area for those programs. The program ROM 203 is a read-only memory that stores the operating programs of the CPU 201. It comprises a program ROM, which stores the basic software (OS) serving as the system program for controlling the server apparatus, and a data ROM, which stores information required for operating the system and the like. The system program may be installed on the HDD 209 (described later) instead of in the ROM 203 and loaded into the RAM 202 for execution. The network interface (NETIF) 204 controls data transfer over the Internet 104, a LAN, or the like, and checks the connection status. The video RAM 205 stores video data for the display unit 206. The display unit 206 is a display device such as an LCD or a CRT monitor. The keyboard controller (KBC) 207 outputs signals entered from the keyboard 208 or a pointing device to the bus 200. The HDD 209 is a hard disk drive that stores application programs and various data (and also serves as the storage units 118 and 123 described above). The FDD 210 controls reading and writing of data from and to a removable disk (storage medium) 213, and is, for example, a floppy (registered trademark) disk drive or a CD-ROM drive. Examples of the storage medium 213 include an FD or external hard disk, an optical storage medium (for example, a CD-ROM), a magneto-optical storage medium (for example, an MO), a semiconductor storage medium (for example, a memory card), and other removable data storage devices. Application programs and data stored on the HDD 209 may also be stored on the FDD 210 for use. The printer controller (PRTC) 211 controls output signals to the printer (PTR) 212. The printer 212 is a printing device such as a laser beam printer (LBP). The bus 200 is a transmission bus (address bus, data bus, input/output bus, and control bus) that connects the units described above.
Be provided with storage unit 110, hiting data storer 111, index manager 112, document register 113, registration destination Discr. 114, the key word that should be noted that the management server 101 shown in Fig. 2 watch unit 115, index extraction apparatus 116 etc. to be realized by CPU201 and RAM202, HDD209 and program.The retrieval unit 120 of archive server 105, document registration unit 121 etc. can be by realizations such as CPU201, HDD209 and programs.
Figure 4 and 5 are process flow diagrams of describing according to the retrieval process on the management server 101 of first embodiment.The procedure stores that is used for carrying out this processing is at ROM203 or RAM202, and carries out under the control of CPU201.
In step S1, receive the key word that the user imports the search instruction among the PC103 and is used as search condition.
The figure of the example key word input window that Fig. 7 shows when showing the key word input PC103 that will be used for retrieving as the user.
Fig. 7 shows key word " engineering A " is input to state in the dialog box 701 that is used to import key word.When under this state, specifying " execution " button 702, the key word (" engineering A ") of search instruction and input is sent to management server 101 from PC103.
This means that retrieval comprises the document of character string " engineering A " in its main body or document title.702 indications of " execution " button begin retrieval.
The process then proceeds to step S2, in which it is determined whether an index corresponding to the keyword received in step S1 has been collected. Here, by referring to a table such as the one shown in Fig. 11, the server in which the index corresponding to this keyword has been collected is identified, and retrieval is requested from that server. That is, based on the keyword obtained in step S1, an inquiry is made to the index manager 112 to determine whether an index corresponding to this keyword has been collected.
If it is determined in this way that an index corresponding to this keyword has been collected, the process proceeds from step S2 to step S3, in which the hit data of the hit table stored in the hit data memory 111 is updated.
In this way, the archive server in which the indexes of the documents corresponding to the input keyword have been collected can be identified. By requesting retrieval from the identified server, documents stored in the other archive servers can also be retrieved, which improves retrieval efficiency.
Fig. 8 shows an example of the hit table stored in the hit data memory 111 according to the first embodiment.
In the example of Fig. 8, for each of the keywords "engineering A", "Tokyo", and "camera", the table registers the use count, the name of the server storing the retrieved documents (the archive server name), the number of documents retrieved on each server, the registration date, and the update date. Each time the corresponding keyword is used, the use count is incremented by 1. It is therefore possible to identify how many times the keyword has been used. The number of retrieved documents indicates how many documents were found as a result of searching the indexes stored in each server for this keyword.
Therefore, in step S3, if the keyword is, for example, "engineering A", the use count of this keyword is incremented by 1 for each archive server storing documents corresponding to this keyword.
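The hit table update of step S3 can be sketched as follows. This is only an illustration under stated assumptions: the `HitEntry` fields mirror the columns described for Fig. 8, but the nested-dictionary layout, the names `HitEntry` and `record_use`, and the refreshing of the update date on each use are hypothetical and are not specified in the description.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class HitEntry:
    use_count: int       # times the keyword has been used for retrieval
    retrieved_docs: int  # documents found on this server for the keyword
    registered: date     # registration date of the entry
    updated: date        # last update date of the entry


# Assumed layout: hit_table[keyword][server_name] -> HitEntry
HitTable = dict[str, dict[str, HitEntry]]


def record_use(table: HitTable, keyword: str, today: date) -> None:
    """Increment the use count of `keyword` for every server listed under it."""
    for entry in table.get(keyword, {}).values():
        entry.use_count += 1
        entry.updated = today  # assumption: a use also refreshes the update date
```

A keyword that is not yet in the table is left alone here; registering it corresponds to step S13 of Fig. 5.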
Figs. 9A and 9B show an example, according to the first embodiment, of a collected index table in which the indexes corresponding to a keyword ("engineering A") have been collected. This collected index table is stored, for example, in the storage unit 108 of the archive server 105.
Fig. 9A shows how the documents whose indexes correspond to the keyword "engineering A" are stored in each archive server. That is, the archive server 105 stores "document 1" and "document 2" as documents containing the character string of the keyword "engineering A". The archive server 106 stores four documents, "document 3" to "document 6", as documents containing the character string of the keyword "engineering A". The archive server 107 stores three documents, "document 7" to "document 9", as documents containing the character string of the keyword "engineering A".
Fig. 9B shows a state in which the indexes corresponding to the keyword "engineering A" have been gathered and registered in the archive server 106 as a collected index.
In this example, the collected index is produced by collecting the indexes of "document 1" to "document 9", which are stored in the archive servers 105 to 107, as the indexes corresponding to the keyword "engineering A". The collected index is then registered in the archive server 106, which stores the largest number of documents corresponding to this collected index.
At this time, in the table shown in Fig. 11, the storage destination of the collected index table for the keyword "engineering A" is "archive server 106".
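The choice of storage destination in this example can be illustrated with a short sketch. The function name `choose_collection_server` is hypothetical, and the tie-breaking behavior (the first server encountered wins) is an assumption, since the description does not say what happens when two servers hold the same number of documents.

```python
def choose_collection_server(doc_counts: dict[str, int]) -> str:
    """Return the server holding the most documents that match the keyword."""
    return max(doc_counts, key=doc_counts.get)


# Document counts for the keyword "engineering A", as described for Fig. 9A.
counts = {
    "archive server 105": 2,  # document 1 and document 2
    "archive server 106": 4,  # document 3 to document 6
    "archive server 107": 3,  # document 7 to document 9
}
destination = choose_collection_server(counts)
```

With the counts of Fig. 9A, the selected destination is the archive server 106, which matches the storage destination recorded in the table of Fig. 11.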
Returning to Fig. 4, after the hit table is updated in step S3, the process proceeds to step S4. The archive server in which the collected index is registered is identified. In the example of Fig. 9 above, this is the archive server 106. In step S4, a retrieval request is sent to the archive server 106 by transmitting the keyword. In step S5, the PC 103 waits for the retrieval result from the archive server 106, and when the retrieval result is received, the process proceeds to step S6, in which the obtained result is displayed. The user of the PC 103 can thus learn from the retrieval result which archive server holds the desired document, and can obtain the desired document from that archive server.
Fig. 10 shows an example of a window displaying document retrieval results.
Fig. 10 lists the titles of the documents retrieved with the keyword ("engineering A"), the archive servers in which the documents are registered, the sizes of the documents, and their update dates and times. Information other than the above document information (who prepared the document, and so on) can also be displayed.
When the indexes corresponding to the keyword used for retrieval are registered as a collected index, only one specific server needs to be accessed, which improves retrieval efficiency.
On the other hand, if it is determined in step S2 that no index corresponding to the keyword is registered in a collected index table, the process proceeds to step S11 (Fig. 5) to determine whether this keyword is registered in the hit table of the hit data memory 111. If the keyword is registered, the process proceeds to step S12 and the hit table is updated accordingly. If the keyword is not registered, the process proceeds to step S13, in which the keyword is registered in the hit table. After step S12 or step S13 has been executed in this way, the process proceeds to step S14. In step S14, the archive servers to be searched with this keyword are identified, and the keyword is sent to those archive servers to request retrieval. In step S16, it is checked whether retrieval results have been received from all the archive servers that were asked to search. If they have, the process proceeds to step S17, in which the retrieval results from these archive servers are merged. The processing performed in steps S14 to S17 is similar to that of conventional multi-server retrieval processing (Fig. 1A).
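The two retrieval paths of Figs. 4 and 5 can be summarized in a minimal sketch. It assumes each archive server is represented by a plain callable that maps a keyword to a list of document titles. The names `retrieve`, `collected_at`, and `SearchFn` are hypothetical, and the hit table bookkeeping of steps S3 and S11 to S13 is omitted for brevity.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]  # keyword -> matching document titles


def retrieve(keyword: str,
             collected_at: dict[str, str],
             servers: dict[str, SearchFn]) -> list[str]:
    """Use the collected index if one exists; otherwise query every server."""
    target = collected_at.get(keyword)   # step S2: lookup in a Fig. 11 style table
    if target is not None:
        return servers[target](keyword)  # steps S4 to S6: a single request
    merged: list[str] = []
    for search in servers.values():      # steps S14 to S16: ask each server
        merged.extend(search(keyword))
    return merged                        # step S17: merged result
```

The point of the first branch is that only one server is contacted, whereas the fallback branch has to wait for every server before the merged result is complete.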
Then, in step S18, based on the hit table updated by the processing in step S12, it is determined whether there is any keyword that satisfies the setting data (collection condition) stored in the setting storage unit 110. Here, the setting data (collection condition) specifies, for example, whether one week has elapsed since the last update date (one week being the update period), or whether the keyword ranks first to third in frequency of use. These items of setting information can be used alone or in combination to determine the indexes to be registered in a collected index table.
For example, in Fig. 8, suppose that the current date is November 7, 2005. Suppose also that the setting data of the setting storage unit 110 specifies that index collection is performed for keywords that rank first, second, or third in frequency of use and that have been updated within the past week. In this case, because index collection is performed only for keywords updated within the past week, the relevant update dates are November 1, 2005 and later. Therefore, in this case, index collection is performed only for the keyword "engineering A".
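The combined condition in this example (top three by frequency of use, and updated within the past week) can be sketched as below. The use counts and update dates are hypothetical, because the actual contents of Fig. 8 are not reproduced in the text; the function name `keywords_to_collect` and its parameters are also assumptions.

```python
from datetime import date, timedelta


def keywords_to_collect(stats: dict[str, tuple[int, date]],
                        today: date,
                        top_n: int = 3,
                        window: timedelta = timedelta(days=7)) -> list[str]:
    """Keywords ranked in the top `top_n` by use count and updated within `window`."""
    ranked = sorted(stats, key=lambda k: stats[k][0], reverse=True)[:top_n]
    return [k for k in ranked if today - stats[k][1] < window]


# Hypothetical (use count, update date) values for the three keywords of Fig. 8.
stats = {
    "engineering A": (12, date(2005, 11, 3)),
    "Tokyo": (9, date(2005, 10, 20)),
    "camera": (5, date(2005, 10, 2)),
}
selected = keywords_to_collect(stats, today=date(2005, 11, 7))
```

With a current date of November 7, 2005, the strict seven-day window admits update dates of November 1, 2005 and later, which is the cutoff stated in the example.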
If it is determined in step S18 that there is no index to be newly registered, the processing ends without further operation. If it is determined that there is an index to be newly registered, the process proceeds to step S19, and each archive server is requested to perform retrieval with this keyword. Each archive server then uses its retrieval unit 120 to retrieve the indexes of the documents containing this keyword. Then, based on the retrieval results, the group of indexes to be collected is obtained from each archive server (step S20). In this case, the index manager 112 generates a collected index table such as the one shown in Fig. 9B. The process then proceeds to step S21, in which the collected index table is sent to the archive server storing the largest number of documents corresponding to the collected index (the archive server 106 in the above example), and the processing ends. The table shown in Fig. 11 is also updated accordingly.
Next, the processing for registering document data is described.
Fig. 6 is a flowchart describing the processing for registering document data in the document retrieval system according to the first embodiment.
In step S31, the document data to be registered is input. In step S32, the index extractor 116 extracts an index from the document data. Then, in step S33, it is determined whether the extracted index has been registered as a collected index as described above. If it has, the process proceeds to step S34, in which it is determined whether the document body (document data) is to be registered in the archive server storing the collected index (the archive server 106 in the above example). This determination is made based on the setting data stored in the setting storage unit 110. If the setting specifies that the document body is to be registered in the archive server storing the collected index, the process proceeds to step S35, in which the document data is registered on that archive server.
As a result, the document data is registered in the archive server that holds the indexes for a keyword frequently used for retrieval, which improves operability for the user.
On the other hand, if the extracted index is not registered as a collected index in step S33, or if the setting in step S34 specifies that document data is to be registered in an archive server other than the one storing the collected index, the process proceeds to step S36, in which the document data is registered in the archive server specified by the user.
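The branching of steps S33 to S36 can be sketched as a small decision function. The function and parameter names are hypothetical. The description also does not say what happens when the extracted keywords are collected on different servers, so this sketch simply takes the first match in sorted keyword order, which is an assumption.

```python
def registration_server(index_keywords: set[str],
                        collected_at: dict[str, str],
                        register_with_index: bool,
                        user_choice: str) -> str:
    """Decide where a new document body is stored (steps S33 to S36)."""
    for keyword in sorted(index_keywords):
        server = collected_at.get(keyword)  # step S33: is the index collected?
        if server is not None and register_with_index:  # step S34: check the setting
            return server                   # step S35: store next to the collected index
    return user_choice                      # step S36: server specified by the user
```

For example, with `collected_at = {"engineering A": "archive server 106"}` and the setting enabled, a document whose index contains "engineering A" is placed on the archive server 106 regardless of the user's choice.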
Although the case of retrieving documents stored in archive servers has been described, a document may be text data or image data such as bitmap data. The present invention can also be applied to retrieval from images that contain no character string information.
[embodiment 2]
Next, the second embodiment of the present invention is described. In the first embodiment, the indexes of documents stored in the two or more archive servers 105 to 107 are collected in the archive server storing the largest number of corresponding documents, and the PC 103 makes a retrieval request to the collection destination archive server.
In contrast, in the second embodiment, the indexes are collected in the device that requests retrieval (the PC 103 in the first embodiment). Therefore, when a keyword whose corresponding indexes have been collected is specified as the search condition, this device can retrieve documents stored in the respective archive servers by referring to the indexes collected in its own device (the PC 103). That is, because there is no need to request retrieval over the Internet, retrieval efficiency is improved further.
Fig. 12 shows the features of the configuration of the document retrieval system according to the second embodiment of the present invention.
The document retrieval system of the second embodiment includes archive servers 1201 to 1204, which are similar to the archive servers 105 to 107 described in the first embodiment and can store documents and the indexes of those documents. Like the PC 103, the archive server 1201 has the functions of receiving a retrieval instruction from the user together with a keyword as the search condition, and of requesting the archive servers 1202 to 1204 to perform retrieval.
The document retrieval system of the second embodiment may include other archive servers in addition to the archive servers 1201 to 1204.
By inputting a keyword as the search condition, the user of the archive server 1201 can retrieve documents stored in any of the archive server 1201 and the archive servers 1202 to 1204. The archive server 1201, having received the keyword input by the user, sends the keyword to the archive servers 1202 to 1204 and requests them to perform retrieval. Each of the archive servers 1202 to 1204 that has received the keyword refers to the indexes stored in its own server to retrieve documents containing the received keyword, and notifies the requesting archive server 1201 of the retrieval result.
At this time, because documents and their indexes are also stored in the archive server 1201, the archive server 1201 also retrieves documents stored in its own device by referring to the indexes stored within it.
Here, when the archive server 1201 retrieves documents stored in the archive servers 1201 to 1204, it can complete retrieval of the documents stored in its own server in a shorter time than it takes to request retrieval from the archive servers 1202 to 1204 and receive the results. In particular, when the archive server 1201 requests retrieval from the archive servers 1202 to 1204 over the network, it must wait until it has received retrieval results from all of those archive servers.
On the other hand, when the archive server 1201 retrieves documents stored in its own device, no data exchange over the network is needed, so retrieval can be completed in a shorter time. Therefore, in the second embodiment, the indexes of documents satisfying a certain condition are collected in advance in the archive server that the user is likely to operate for retrieval.
For example, in the example shown in Fig. 12, copies of the index F of the document F stored in the archive server 1203 and of the index H of the document H stored in the archive server 1204 are also stored in the archive server 1201. As a result, the next time the user instructs retrieval at the archive server 1201, the retrieval results can be obtained in a shorter time because the indexes of the documents F and H are stored in the archive server 1201.
Fig. 13 shows the configuration of the document retrieval system according to the second embodiment. This system includes two or more archive servers 1201 to 1204 connected via the Internet 1300. Although the Internet 1300 is used here to connect the servers, the present invention is not particularly limited to this. For example, a LAN or another network system can be used.
The archive servers 1201 to 1204 each store document bodies and document indexes, and provide functions such as registering, viewing, obtaining, updating, and retrieving documents. The user retrieves documents registered in the system using the retrieval function provided by the archive servers 1201 to 1204. After receiving a retrieval request that specifies a keyword, each of the archive servers 1201 to 1204 determines whether this keyword corresponds to an index managed by that server. If the keyword corresponds to an index, the server notifies the user of the result. The archive servers 1201 to 1204 also provide a function for registering documents.
Next, the configuration of the archive server 1201 is described. The configuration of the archive servers 1202 to 1204 may be similar to that of the archive server 1201, or may be similar to that of the archive servers 105 to 107 of the first embodiment.
The display unit 1320 is a display device, such as an LCD, provided in the archive server 1201. The search condition input unit 1321 receives the search keyword input by the user by displaying the window shown in Fig. 7 on the display unit 1320. When the retrieval unit 1322 receives from the user a retrieval request with a keyword specified as the search condition, it performs retrieval by referring to the indexes stored in its own device and provides the retrieval result to the user. When it receives a retrieval request from another archive server, the retrieval unit 1322 performs retrieval in a similar manner and provides the result to the archive server that requested the retrieval.
The retrieval result is displayed on the display unit 1320, thereby notifying the user. The index manager 1323 manages all the indexes stored in the archive server 1201. Document bodies and indexes are stored in the storage unit 1311. The indexes stored in the archive server 1201 include both indexes corresponding to document bodies stored in its own device and indexes of documents whose bodies are stored in other archive servers.
The search condition delivery unit 1324 sends the search keyword input in the search condition input unit 1321 to the other archive servers. Conversely, the search condition receiver 1325 receives search keywords sent from other archive servers. The retrieval result delivery unit 1326 sends the result of the retrieval performed by the retrieval unit 1322 to other archive servers. Conversely, the retrieval result receiver 1327 receives retrieval results sent from other archive servers and displays them on the display unit 1320. The statistics memory 1328 stores statistics about the keywords and the like input in the search condition input unit 1321.
Based on the statistics stored in the statistics memory 1328, the collection condition creator 1329 extracts the conditions for collecting indexes in the archive server 1201. A collection condition here consists of a keyword contained in the index of each document, information indicating the collection destination archive server, and the like. The collection condition delivery unit 1330 sends the collection conditions created by the collection condition creator 1329 to the other archive servers. The collection condition receiver 1331 receives collection conditions sent from other archive servers.
The collection condition storage unit 1332 stores, in the storage unit 1311, the collection conditions created by the collection condition creator 1329 and the collection conditions received from other archive servers. Based on the collection conditions stored in the collection condition storage unit 1332, the collection condition discriminator 1333 determines which of the indexes managed by the index manager 1323 fall under a collection condition.
The index forwarder 1334 sends the indexes that the collection condition discriminator 1333 has determined are to be collected to the specified archive server. The index receiver 1335 receives indexes sent from other archive servers and stores them in the storage unit 1311.
The document registration unit 1336 registers new documents in the archive server 1201. The body (and corresponding index) of a document to be registered is input from an external device (not shown) through the network interface 1310. At this time, if the collection condition discriminator 1333 determines that the document to be registered is subject to index collection, the document registration unit 1336 uses the index forwarder 1334 to send the index corresponding to the document to the specified archive server.
The hardware configuration of the archive servers 1201 to 1204 is similar to that of the management server 101 and the archive server 105 shown in Fig. 3.
Figure 14 and 19 is process flow diagrams of describing according to the retrieval process on the second embodiment archive server.The procedure stores that is used to carry out this processing and is carried out under the control of CPU201 in ROM203 or RAM202.
Figure 14 describe the user operate on it in case carry out the retrieval archive server (in this case, archive server 1201) on processing.In step S41, receive the search instruction of user's input and the key word that is used to retrieve.The sample window of input search key is similar to the window shown in Fig. 7.Then, program enters step S42, and the key word of importing in step S1401 in this step adds the information in the statistics storer 1328 that is stored in to so that upgrade.
Statistics storer 1328 is the tableau format shown in Figure 15.In Figure 15, for the key word that is used to retrieve, to retrieval number counting.That is, when search condition being appointed as in predetermined key word, will retrieve number increases by 1 at every turn, thus counting retrieval number.
Then, program enters step S43, determines whether that at this step retrieval unit 1322 index that will be included in the character string of the key word that step S41 receives is registered in its oneself the equipment.In this case, determine whether to register the index of the character string of the key word that is included in step S41 reception by the table shown in reference Figure 16.By the table shown in index manager 1323 management Figure 16.Index manager 1323 is upgraded the information of this table often according to the interpolation/deletion of index.
In the example shown in Figure 16, the storage of index title, search key and documents location with being relative to each other connection.Watch the documents location hurdle, comprise the index of indicating except the archive server outside the archive server (in this case, archive server 1201) of carrying out retrieval.This points out that the index that is registered in the document in other archive servers is duplicated in the archive server 1201 of carrying out retrieval.
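A table of the kind described for Fig. 16 and the local lookup of step S43 can be sketched as follows. The three columns follow the description (index name, search keywords, document location), but every row shown here is hypothetical, as are the names `IndexEntry` and `search_local`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    name: str                 # index name, for example "index A"
    keywords: frozenset[str]  # search keywords contained in the index
    location: str             # archive server holding the document body


def search_local(table: list[IndexEntry], keyword: str) -> list[IndexEntry]:
    """Return the entries whose keywords include `keyword` (steps S43 and S52)."""
    return [entry for entry in table if keyword in entry.keywords]


# Hypothetical rows. The last one stands for an index copied from another server.
table_1201 = [
    IndexEntry("index A", frozenset({"orange"}), "archive server 1201"),
    IndexEntry("index B", frozenset({"apple"}), "archive server 1201"),
    IndexEntry("index C", frozenset({"orange", "grape"}), "archive server 1203"),
]
hits = search_local(table_1201, "orange")
```

Because the copied entry keeps its original document location, a local hit still tells the user which archive server holds the document body.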
The process proceeds to step S44, in which the search condition delivery unit 1324 sends the search condition to the other archive servers (in this case, the archive servers 1202 to 1204).
Fig. 17 shows an example of the search condition sent in step S44. In Fig. 17, the search keyword and the archive server to which the retrieval result is to be sent are specified.
The process then proceeds to step S45, in which the retrieval result receiver 1327 receives retrieval results from the other archive servers. When retrieval results have been received from all the archive servers to which the search condition was sent in step S44, the series of processing ends. The retrieval results, including those received from the other archive servers, are displayed on the display unit 1320 as they become available, without waiting until all retrieval results have been received.
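The behavior of showing local results first and remote results as they arrive can be sketched with Python's standard `concurrent.futures` module. This is an illustration only: the archive servers are modeled as plain callables, the thread pool stands in for network requests, and the names `search_incrementally` and `SearchFn` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterator

SearchFn = Callable[[str], list[str]]  # keyword -> matching document titles


def search_incrementally(keyword: str, local: SearchFn,
                         remotes: dict[str, SearchFn]
                         ) -> Iterator[tuple[str, list[str]]]:
    """Yield (source, results) pairs as soon as each one is available."""
    # Step S43: the local lookup needs no network round trip.
    yield "local", local(keyword)
    if not remotes:
        return
    with ThreadPoolExecutor(max_workers=len(remotes)) as pool:
        # Step S44: send the search condition to every other archive server.
        pending = {pool.submit(fn, keyword): name for name, fn in remotes.items()}
        # Step S45: hand over each remote result when it arrives.
        for future in as_completed(pending):
            yield pending[future], future.result()
```

A caller can render each yielded pair immediately, so the locally obtained results appear before any remote server has answered.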
Fig. 18 shows an example of the retrieval result display on the archive server that the user operates to perform retrieval (the archive server 1201).
Fig. 18 shows the retrieval results when "orange" is specified as the search keyword. If the indexes stored in the archive server 1201 are as shown in Fig. 16, the index A and the index C contain "orange" as a keyword and are therefore found when the archive server 1201 searches the indexes stored within it. Reference numeral 1801 denotes the retrieval results that the archive server 1201 obtains from the indexes stored within it; these are displayed in a relatively short time. Reference numeral 1802 denotes the retrieval results from the other archive servers 1202, 1203, and 1204. Because these results are received from the other archive servers over the Internet 1300, they usually take longer to display than the retrieval results 1801.
Fig. 19 is a flowchart describing the processing on the archive servers (in this embodiment, the archive servers 1202 to 1204) to which the search condition shown in Fig. 17 was sent in step S44 of Fig. 14.
In step S51, the search condition receiver 1325 receives the search condition. The process then proceeds to step S52, in which the retrieval unit 1322 determines whether any index containing the keyword string received in step S51 is registered in its own device. Retrieval is performed here in the same way as in step S43: the retrieval unit 1322 refers to the table of Fig. 16 stored in each archive server and retrieves the indexes containing the keyword received as the search keyword in step S51. The process proceeds to step S53, in which the retrieval result delivery unit 1326 sends the retrieval result obtained in step S52 to the specified archive server. The retrieval result delivery unit 1326 sends the retrieval result even if the number of indexes found is 0.
Next, the index collection processing is described, taking the archive server 1201 as an example.
Figs. 20, 22, and 23 are flowcharts describing the index collection processing performed on the archive servers.
First, in step S61, the collection condition creator 1329 refers to the statistics memory 1328 and extracts, as collection conditions, keywords that are frequently specified as search conditions for retrieval from its own device. Indexes containing the keywords extracted as collection conditions are obtained from the other archive servers (the archive servers 1202 to 1204), and copies of them are created in the archive server 1201, so that the indexes containing these keywords are collected in the archive server 1201.
The collection conditions created in this way are stored in the collection condition storage unit 1332. Collection conditions can be created periodically, or at irregular times triggered by a user operation. For example, the creation of collection conditions can be triggered by the update, in step S42, of the statistics stored in the statistics memory 1328.
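The creation of collection conditions from the retrieval statistics can be sketched as below. The description does not define how often a keyword must be used to count as "frequently specified", so the `min_count` threshold is an assumption, as are the function name and the dictionary layout of a condition (keyword plus collection destination, following the description of Fig. 21).

```python
from collections import Counter


def create_collection_conditions(stats: Counter, own_server: str,
                                 min_count: int = 10) -> list[dict[str, str]]:
    """Turn frequently used keywords into (keyword, destination) conditions."""
    return [{"keyword": keyword, "destination": own_server}
            for keyword, count in stats.most_common()
            if count >= min_count]


stats = Counter()
for keyword in ["orange"] * 12 + ["apple"] * 3:  # step S42: count each retrieval
    stats[keyword] += 1
conditions = create_collection_conditions(stats, "archive server 1201")
```

In this run only "orange" reaches the assumed threshold, so a single condition naming the archive server 1201 as the collection destination is produced.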
The process then proceeds to step S62, in which it is determined whether the collection conditions stored in the collection condition storage unit 1332 have been updated. If they have, the process proceeds to step S63, in which the collection condition delivery unit 1330 sends the updated collection conditions to the other archive servers (in this embodiment, the archive servers 1202 to 1204).
Fig. 21 shows an example of the collection condition sent in step S63 of Fig. 20.
This collection condition includes the keyword extracted in step S61 and information indicating the collection destination server for this keyword. At this point, the archive server 1201 suspends the series of processing and remains in a standby state until it receives the indexes sent from the other servers.
Fig. 22 is a flowchart describing the processing on an archive server that receives the collection condition sent in step S63 of Fig. 20.
First, in step S71, the collection condition receiver 1331 receives the collection condition. The process proceeds to step S72, in which the collection condition received in step S71 is stored in the collection condition storage unit 1332.
The process then proceeds to step S73, in which the collection condition discriminator 1333 determines whether any index corresponding to the collection condition received in step S71 is registered in its own device. This determination is made in the same way as in step S43: the collection condition discriminator 1333 refers to the table shown in Fig. 16 and determines whether any index containing the keyword string received in step S71 is registered.
The process then proceeds to step S74, and if an index to be collected was found in step S73, the process proceeds to step S75. In step S75, the index forwarder 1334 sends the index found in step S73 to the specified server. At this time, the index registered in its own device is retained and a copy of it is sent to the destination archive server, so that the index is duplicated. If two or more indexes were found in step S73, copies of all of them are sent to the respective specified archive servers.
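Steps S73 to S75 can be sketched as a function that maps each collection destination to the indexes to be copied there. The names and data layout are hypothetical. The local index table is only read, never modified, which reflects the statement that the original index is retained and only a copy is sent.

```python
def indexes_to_send(local_indexes: dict[str, set[str]],
                    conditions: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each destination server to the names of the indexes to copy there.

    local_indexes maps an index name to its keywords.
    conditions is a list of (keyword, destination server) pairs.
    """
    outgoing: dict[str, list[str]] = {}
    for keyword, destination in conditions:
        for name, keywords in sorted(local_indexes.items()):
            if keyword in keywords:                    # step S73: condition matches
                targets = outgoing.setdefault(destination, [])
                if name not in targets:                # send each index only once
                    targets.append(name)               # step S75: queue a copy
    return outgoing
```

An empty result corresponds to the case in step S74 where no index to be collected was found.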
Fig. 23 is a flowchart describing the operation on an archive server when an index is sent to it from another archive server.
In step S81, the index receiver 1335 receives the sent index. The process proceeds to step S82, in which the index manager 1323 stores the index received in step S81 in the storage unit 1311. The series of processing then ends.
Through the above processing, the indexes of documents containing frequently used search keywords are collected in the archive server. For example, when the indexes stored in the archive server 1201 are as shown in Fig. 16, the indexes C and D are stored in the archive server 1201 as a result of being copied from the archive servers 1203 and 1204, respectively.
The processing for registering a document is described below, taking the archive server 1201 as an example.
Fig. 24 is a flowchart describing the processing for registering a document on an archive server. As mentioned above, the body (and corresponding index) of the document to be registered is input from an external device (not shown) through the network interface 1310. As described in the first embodiment, the index corresponding to the body of the registered document can also be extracted in the archive server (step S31).
First, in step S91, the document body and corresponding index registered through the document registration unit 1336 are stored in the storage unit 1311. The process then proceeds to step S92, in which the collection condition discriminator 1333 determines whether the index stored in step S91 falls under a collection condition stored in the collection condition storage unit 1332. If it does, the process proceeds to step S93. In step S93, the index forwarder 1334 sends the index that falls under the collection condition to the archive server specified by that collection condition. When the index falls under two or more collection conditions, it is sent in step S93 to the collection destination archive servers specified by all of those conditions.
An example of the document registration processing is described below with reference to Figs. 25A and 25B.
Fig. 25A shows the index corresponding to the document to be registered. Fig. 25B shows the collection conditions stored in the collection condition storage unit 1332 of the archive server in which the document is registered (in this case, the archive server 1201). As shown in Fig. 25A, the document to be registered contains keywords such as "panda" and "elephant". In Fig. 25B, these keywords are to be collected in the archive servers 1203 and 1204, respectively, so the index is sent to the archive servers 1203 and 1204.
With the above processing, index collection can also be performed for newly registered documents. When an archive server that has received the index sent in step S93 of Fig. 24 copies the index into its own device, the copying is performed by the same processing as described for Fig. 23.
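The forwarding decision of steps S92 and S93 for the "panda" and "elephant" example can be sketched as follows. The function name is hypothetical, the keyword "zebra" is added only to show a keyword that matches no condition, and the one-destination-per-keyword dictionary is a simplifying assumption.

```python
def forward_targets(doc_keywords: set[str],
                    conditions: dict[str, str]) -> set[str]:
    """Servers that should receive a copy of the new document's index (step S93)."""
    return {conditions[kw] for kw in doc_keywords if kw in conditions}


# Collection conditions as described for Fig. 25B: keyword -> collection destination.
conditions = {"panda": "archive server 1203", "elephant": "archive server 1204"}
targets = forward_targets({"panda", "elephant", "zebra"}, conditions)
```

Here the index is forwarded to both the archive server 1203 and the archive server 1204, because the document falls under two collection conditions.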
In this way, the indexes corresponding to keywords frequently specified as search conditions are collected in advance in the archive server that the user operates to perform retrieval. As a result, retrieval that refers to the collected indexes can be performed within that archive server itself, which makes fast retrieval possible.
[embodiment 3]
Then, the third embodiment of the present invention is described.In a second embodiment, in may being operated, duplicate the index of the search key that comprises frequent use in advance with the archive server of retrieving.On the contrary, in the 3rd embodiment, suppose document display device, printing equipment etc. are used as archive server, and index is duplicated in certain archive server based on the characteristic of the document of registering.
Suppose in archive server, to show or print the document of retrieval.In the 3rd embodiment, be similar to second embodiment, the user operates archive server searched targets document.Carry out retrieval from the archive server of operation, and the document server is sent to other archive servers with search condition simultaneously, thus the request retrieval.If desirable document is registered in the archive server of user operation since in its oneself the equipment search file, can retrieve fast.
Figure 26 is a conceptual diagram of the document retrieving system according to the third embodiment of the present invention. In Figure 26, each archive server is assumed to be, for example, an image display device. With regard to the capabilities of these image display devices, it is further assumed that some archive servers can display color images and others can display only monochrome images.
When color images and monochrome images are registered as documents in the document retrieving system, color images are expected to be retrieved and displayed from an archive server capable of color display. For this reason, indexes corresponding to color documents are collected (copied) in advance in an archive server capable of color display, so that a color document can be retrieved in a short time from that collection destination archive server.
In Figure 26, archive server 2601 is capable of color display, and archive servers 2602 to 2604 are capable of monochrome display only. Indexes F and H, which correspond to color documents, are copied to and stored in archive server 2601. That is, the indexes corresponding to color documents are collected in archive server 2601. Therefore, when a color document is retrieved from archive server 2601, the retrieval is performed within its own device and the color document can be retrieved in a short time.
As described above, in the third embodiment the collection destination archive server for an index is determined according to the characteristics of the registered document. The third embodiment describes the case where the archive server is an image display device, but the archive server is not limited to this. For example, when the archive servers are printing devices, a configuration is possible in which indexes corresponding to data consisting of a large number of pages are copied to an archive server capable of high-speed printing or to an archive server with a large amount of remaining consumables. A configuration in which indexes corresponding to graphic documents are copied to an archive server with high print resolution is also possible.
Figure 27 shows the configuration of the document retrieving system according to the third embodiment. A document properties discriminator 2701 and an archive server capability storage unit 2702 are added to the configuration described in the second embodiment. Otherwise, the configuration is the same as that shown in Figure 13. The document properties discriminator 2701 determines whether a registered document is a color image or a monochrome image. The archive server capability storage unit 2702 stores capability information on the display unit 1320 of each archive server. Except for the document registration processing, the operation of the third embodiment is the same as that of the second embodiment described above, so description of the common operations is omitted.
Figure 28 is a flowchart describing the document registration processing on an archive server. As in the second embodiment, the body of the document to be registered (and the corresponding index) is input from an external device (not shown) through the network interface 1310. As described in the first embodiment, the index corresponding to the body of the registered document may instead be extracted within the archive server (step S31).
First, in step S2801, the document registration unit 1336 stores the input document body and the corresponding index in the storage unit 1311. The process then proceeds to step S2802, where the collection condition discriminator 1333 determines whether the index stored in step S2801 satisfies a collection condition stored in the collection condition storage unit 1332.
If the index stored in step S2801 satisfies a collection condition, the process proceeds from step S2802 to step S2803, where the index forwarder 1334 sends the index to the archive server specified by that collection condition.
If the index satisfies two or more collection conditions in step S2803, it is sent to every collection destination archive server specified by those collection conditions. The process then proceeds to step S2804, where the document properties discriminator 2701 determines the attributes of the registered document. In this case, the color type (color or monochrome) and the image size of the document are detected.
The process then proceeds to step S2805, where it is determined whether the color type of the document determined in step S2804 is "color". If the color type is "color", the process proceeds to step S2806. In step S2806, the archive server capability storage unit 2702 is consulted to determine an archive server that can display, in color, an image of the size given by the attributes determined in step S2804. If any archive server capable of such display exists, the process proceeds to step S2807, where the index forwarder 1334 sends the index to the archive server determined in step S2806 to be capable of the display.
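The registration flow of steps S2802 to S2807 can be summarized in a short sketch. The data classes, field names, and dictionary layouts below are hypothetical, step S2801 (local storage) is left out, and the size comparison treats a display of exactly the document's size as sufficient, which is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Document:
    keywords: Set[str]
    color_type: str              # "color" or "monochrome"
    image_size: Tuple[int, int]  # (width, height) in pixels


@dataclass
class Capability:
    color: bool
    max_size: Tuple[int, int]    # largest displayable (width, height)


def registration_targets(
    doc: Document,
    collection_conditions: Dict[str, str],
    capabilities: Dict[str, Capability],
) -> List[str]:
    """Return the servers to which the index is sent during registration."""
    targets: List[str] = []
    # S2802-S2803: send the index to every server named by a matching condition.
    for keyword, server in collection_conditions.items():
        if keyword in doc.keywords and server not in targets:
            targets.append(server)
    # S2804-S2805: only color documents go through the capability check.
    if doc.color_type == "color":
        # S2806-S2807: send the index to servers able to show it in color.
        for server, cap in capabilities.items():
            fits = (cap.max_size[0] >= doc.image_size[0]
                    and cap.max_size[1] >= doc.image_size[1])
            if cap.color and fits and server not in targets:
                targets.append(server)
    return targets
```

A monochrome document therefore reaches only the keyword-based destinations, while a color document may additionally reach servers chosen by display capability.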
Next, an example of document registration is described with reference to Figures 29A and 29B.
Figure 29A shows example attributes of the document to be registered, namely the color type and the image size of the document.
Figure 29B shows an example table indicating the capabilities of each archive server, as stored in the archive server capability storage unit 2702. From Figure 29A, the color type (color) and image size (1024 × 768) of the document to be registered are obtained. Based on Figure 29B, archive server 2603, which can display in color images larger than that image size (1024 × 768), is determined to be suitable for displaying the document. Therefore, the index of the document is sent to archive server 2603. Note that although archive servers 2601 and 2604 are capable of color display, they are not selected because their capability with respect to image size is insufficient.
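The lookup against the capability table can be illustrated with a small worked example. The table values below are invented for illustration, since the actual contents of Figure 29B are not reproduced in this text; they are chosen only so that the outcome matches the description (2603 selected, 2601 and 2604 color-capable but too small). Whether a display of exactly 1024 × 768 would qualify is also an assumption; the sketch treats it as sufficient.

```python
from typing import Dict, List, Tuple

# Hypothetical capability table: server -> (supports color, max width, max height).
CAPABILITY_TABLE: Dict[str, Tuple[bool, int, int]] = {
    "2601": (True, 800, 600),
    "2602": (False, 1280, 1024),
    "2603": (True, 1280, 1024),
    "2604": (True, 640, 480),
}


def color_display_candidates(
    table: Dict[str, Tuple[bool, int, int]], width: int, height: int
) -> List[str]:
    """Return servers that can show a width x height image in color."""
    return [
        server
        for server, (color, max_w, max_h) in table.items()
        if color and max_w >= width and max_h >= height
    ]


print(color_display_candidates(CAPABILITY_TABLE, 1024, 768))
# → ['2603']
```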
With the above processing, the index of a newly registered document can be copied to (registered in) an archive server whose display unit can display the document in color at that image size.
In this way, based on the characteristics of the registered document, the index can be collected in the archive server that is expected to be operated for retrieval. As a result, fast retrieval becomes possible.
(Other embodiments)
Although embodiments of the present invention have been described in detail above, the present invention may be applied to a system comprising two or more devices, or to an apparatus consisting of a single device.
Note that the present invention can also be realized by supplying a software program that implements the functions of the above embodiments to a system or apparatus, directly or remotely, and having a computer of that system or apparatus read and execute the supplied program. In this case, the form need not be that of a program, as long as the functions of the program are provided.
Accordingly, the program code itself, installed on a computer in order to realize the functional processing of the present invention on that computer, also realizes the present invention. In other words, the computer program itself for realizing the functional processing of the present invention is also included in the scope of the present invention. In this case, as long as it has the functions of the program, it may take any form, such as object code, a program executed by an interpreter, or script data supplied to an OS.
Storage media for supplying the program include, for example, the following: a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, non-volatile memory card, ROM, and DVD (DVD-ROM, DVD-R).
Alternatively, the program can be supplied as follows. A browser on a client computer is used to access a home page on the Internet, and the program is downloaded from the home page to a storage medium such as a hard disk. In this case, what is downloaded may be the computer program of the present invention itself or a compressed file with an automatic installation function. The program can also be supplied by dividing the program code constituting the program of the present invention into two or more files and having each file downloaded from a different home page. In other words, a WWW server that allows two or more users to download the program files for realizing the functional processing of the present invention on a computer is also included in the scope of the present invention.
The program may also be supplied by encrypting the program of the present invention, storing it on a storage medium such as a CD-ROM, and distributing it to users. In this case, users who satisfy a certain criterion are allowed to download key information for decryption from a home page via the Internet, and the encrypted program is installed on a computer in executable form by using that key information.
Furthermore, embodiments other than those in which the functions of the above embodiments are realized by a computer executing the program it has read are also possible. For example, an OS or the like running on the computer may perform part or all of the actual processing based on the instructions of the program, and the functions of the above embodiments may be realized by that processing.
In addition, the program read from the storage medium may be written to a memory provided in an expansion board inserted into the computer or in an expansion unit connected to the computer. In this case, after the program is written to the memory, a CPU or the like provided in the expansion board or expansion unit performs part or all of the actual processing based on the instructions of the program, and the functions of the above embodiments are realized by that processing.
As described above, by avoiding the sharing of indexes among all servers, embodiments of the present invention can suppress the increase in the amount of index data in a multi-server retrieving system.
In addition, indexes that are, for example, frequently retrieved or frequently used are collected, and when a keyword specified for retrieval corresponds to the collected indexes, the retrieval using that keyword is performed against the collected indexes. Retrieval efficiency is therefore improved.
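The interplay between the retrieval log, the collection trigger, and the later lookup in the collected indexes can be sketched as below. The class and method names are hypothetical, the collection condition is modeled as "specified at least a threshold number of times", and "the same search condition" is modeled as an exact keyword match; none of this is stated as the patented implementation.

```python
from collections import Counter
from typing import Dict, List, Optional


class RetrievalLog:
    """Tracks how often each keyword is searched and holds collected indexes."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.collected: Dict[str, List[str]] = {}  # keyword -> document ids

    def record(self, keyword: str) -> bool:
        """Log one search; return True if collection should now be started."""
        self.counts[keyword] += 1
        return (self.counts[keyword] >= self.threshold
                and keyword not in self.collected)

    def store_collected(self, keyword: str, doc_ids: List[str]) -> None:
        """Keep the index entries gathered from the other servers."""
        self.collected[keyword] = list(doc_ids)

    def lookup(self, keyword: str) -> Optional[List[str]]:
        """Return collected results for an identical condition, else None."""
        return self.collected.get(keyword)
```

A caller would fall back to querying every server when `lookup` returns `None`, and answer from the collected indexes otherwise.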
Although the present invention has been described with reference to exemplary embodiments, it should be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (13)

1. A document retrieving system for retrieving a document from among documents registered in two or more document management servers connected via a network, the document retrieving system comprising:
a storage unit, provided in each of the document management servers, for storing documents and index data corresponding to the documents;
a collector unit for collecting, from the index data stored in the storage unit of each of the document management servers, index data corresponding to documents that satisfy a first search condition;
a determining unit for determining whether a second search condition, which is specified by a user, is identical to the first search condition; and
a retrieval unit for retrieving a document by referring to the index data collected by the collector unit when the determining unit determines that the second search condition is identical to the first search condition.
2. The document retrieving system according to claim 1, further comprising:
a collection information management unit for managing collection information, the collection information including information indicating the first search condition and information indicating the document management server in which the index data collected by the collector unit is stored,
wherein, when the determining unit determines that the second search condition is identical to the first search condition, the retrieval unit accesses, based on the information managed by the collection information management unit, the document management server in which the index data has been collected, and retrieves a document.
3. The document retrieving system according to claim 1, further comprising:
a log information management unit for managing log information on search conditions specified by users; and
a control unit for controlling the collector unit so as to collect index data corresponding to documents that satisfy a search condition when it is determined, based on the log information managed by the log information management unit, that there exists a search condition satisfying a predetermined collection condition.
4. The document retrieving system according to claim 3, wherein the predetermined collection condition is that a given search condition has been specified by users at least a predetermined number of times.
5. The document retrieving system according to claim 1, wherein the collector unit collects the index data corresponding to documents that satisfy the first search condition such that the index data is stored in the document management server that, among the document management servers, stores the largest number of documents satisfying the first search condition.
6. The document retrieving system according to claim 1, further comprising:
a registration unit for registering a new document and index data corresponding to the new document in one of the document management servers,
wherein, when the new document satisfies the first search condition, the registration unit registers the new document and the index data in the document management server that stores the index data collected by the collector unit.
7. A document retrieving apparatus for retrieving a document from among documents registered in two or more document management servers connected to the document retrieving apparatus via a network, the document retrieving apparatus comprising:
an input unit for receiving a search condition for retrieving a document;
a log information management unit for managing log information on the search conditions received by the input unit;
a determining unit for determining, based on the log information managed by the log information management unit, whether there exists a search condition satisfying a predetermined collection condition; and
a collector unit for collecting, in the document retrieving apparatus, index data corresponding to documents that satisfy the search condition when the determining unit determines that a search condition satisfying the predetermined collection condition exists.
8. The document retrieving apparatus according to claim 7, wherein the predetermined collection condition is that a given search condition has been received by the input unit at least a predetermined number of times.
9. The document retrieving apparatus according to claim 7, further comprising:
a recognition unit for identifying an attribute of a document,
wherein, when the determining unit determines that a search condition satisfying the predetermined collection condition exists, the collector unit collects index data corresponding to documents selected, based on the identification result of the recognition unit, from among the documents that satisfy the search condition.
10. The document retrieving apparatus according to claim 9, further comprising:
a capability information management unit for managing capability information on the functions of the document retrieving apparatus,
wherein, when the determining unit determines that a search condition satisfying the predetermined collection condition exists, the collector unit collects index data corresponding to documents selected, based on the identification result of the recognition unit and the capability information managed by the capability information management unit, from among the documents that satisfy the search condition.
11. The document retrieving apparatus according to claim 10, comprising at least one of a display unit for displaying the document and a printer for printing the document,
wherein the capability information management unit manages capability information on at least the functions of the display unit or the functions of the printer.
12. A document retrieving method for retrieving a document from among documents registered in two or more document management servers connected via a network, the method comprising the steps of:
storing, in a storage unit of each of the document management servers, documents and index data corresponding to the documents;
collecting, from the index data stored in the storage units in the storing step, index data corresponding to documents that satisfy a first search condition;
determining whether a second search condition, which is specified by a user, is identical to the first search condition; and
performing retrieval by referring to the index data collected in the collecting step when it is determined in the determining step that the second search condition is identical to the first search condition.
13. A document retrieving method for a document retrieving apparatus that retrieves a document from among documents registered in two or more document management servers connected to the document retrieving apparatus via a network, the method comprising the steps of:
inputting a search condition for retrieving a document;
managing log information on the search conditions input in the inputting step;
determining, based on the log information managed in the log information managing step, whether there exists a search condition satisfying a predetermined collection condition; and
collecting, in the document retrieving apparatus, index data corresponding to documents that satisfy the search condition when it is determined in the determining step that a search condition satisfying the predetermined collection condition exists.
CNB2007100885240A 2006-03-14 2007-03-14 Document retrieving system, document retrieving apparatus, and method thereof Expired - Fee Related CN100498791C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2006069902 2006-03-14
JP2006069902 2006-03-14
JP2007032681 2007-02-13

Publications (2)

Publication Number Publication Date
CN101059811A true CN101059811A (en) 2007-10-24
CN100498791C CN100498791C (en) 2009-06-10

Family

ID=38865917

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007100885240A Expired - Fee Related CN100498791C (en) 2006-03-14 2007-03-14 Document retrieving system, document retrieving apparatus, and method thereof

Country Status (1)

Country Link
CN (1) CN100498791C (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853288A (en) * 2010-05-19 2010-10-06 马晓普 Configurable full-text retrieval service system based on document real-time monitoring
CN103425629A (en) * 2012-05-24 2013-12-04 富士通株式会社 Generation apparatus, generation method, searching apparatus, and searching method
CN106156266A (en) * 2015-05-12 2016-11-23 富士施乐株式会社 Information processor and information processing method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006072744A (en) * 2004-09-02 2006-03-16 Canon Inc Document processor, control method therefor, program and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853288A (en) * 2010-05-19 2010-10-06 马晓普 Configurable full-text retrieval service system based on document real-time monitoring
CN103425629A (en) * 2012-05-24 2013-12-04 富士通株式会社 Generation apparatus, generation method, searching apparatus, and searching method
CN103425629B (en) * 2012-05-24 2017-05-03 富士通株式会社 Generation apparatus, generation method, searching apparatus, and searching method
CN106156266A (en) * 2015-05-12 2016-11-23 富士施乐株式会社 Information processor and information processing method
CN106156266B (en) * 2015-05-12 2019-10-29 富士施乐株式会社 Information processing unit and information processing method

Also Published As

Publication number Publication date
CN100498791C (en) 2009-06-10


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090610
