WO2014034383A1 - Information processing device, record location information specification method, and information processing program - Google Patents

Information processing device, record location information specification method, and information processing program

Info

Publication number
WO2014034383A1
Authority
WO
WIPO (PCT)
Prior art keywords
record
item
database
value
position information
Prior art date
Application number
PCT/JP2013/071127
Other languages
English (en)
Japanese (ja)
Inventor
古庄 晋二
Original Assignee
株式会社ターボデータラボラトリー
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社ターボデータラボラトリー

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Definitions

  • The present invention relates to database management technology, and particularly to management technology for large-scale data stored in a distributed manner.
  • The basic role of a database management device is search: accumulating data, retrieving the necessary data from it, and presenting it.
  • An index is essential for speeding up this search. Examples of the existing index include B-Tree and hash (for example, see Non-Patent Document 1).
  • Non-Patent Document 1: Douglas Comer, “The Ubiquitous B-Tree”, Computing Surveys, June 1979, Vol. 11, No. 2, pp. 121-137
  • the conventional index cannot handle large-scale data or data obtained in a distributed manner.
  • An index for a large-scale database should have the property that its required storage capacity does not grow rapidly as the database becomes large.
  • Specifically, the index size is preferably O(n).
  • It is also desirable that data be acquired in a serverless manner, that the data acquired at each location be managed in a distributed manner as is, and that it be freely accessible via the network. None of this can be realized with current indexes.
  • the present invention has been made in view of the above circumstances, and an object of the present invention is to provide a technology that can manage a large-scale database at low cost without restrictions on the use environment and provide an easy-to-use environment.
  • The present invention uses an index for specifying position information of a desired record in a database that stores a plurality of records, each having a unique record number. The index returns the record number of a specified value and returns the record number corresponding to a rank after sorting by a predetermined item.
  • An information processing apparatus is provided that includes a position information specifying unit that specifies position information of a record using this index.
  • the size of this index is proportional to the original database size.
  • FIG. 1 is a block diagram of the database system of the first embodiment.
  • (a)-(d) are explanatory diagrams for describing the database of the first embodiment.
  • (a)-(d) are explanatory diagrams for describing the database of the first embodiment. An explanatory diagram for describing the virtual integrated data and virtual integrated sort data of the first embodiment.
  • (a)-(c) are explanatory diagrams for describing the index file for each data item of the first embodiment.
  • (a) and (b) are explanatory diagrams for describing the index file for each table of the first embodiment. A flowchart of the first search process of the first embodiment.
  • an information processing apparatus including a position information specifying unit that specifies position information of a record using an index that returns a record number corresponding to a rank after sorting by a predetermined item.
  • the size of this index is proportional to the original database size.
  • an information processing apparatus that manages a database including records that store item values for predetermined data items, using an index file for each data item that can be searched, and the index file
  • a position information specifying unit that specifies position information of a desired record, and each record is uniquely given a record number in advance, and the position information specifying unit specifies the record number as the position information.
  • the index file for each data item can acquire the record number from the item value of the data item, and can acquire the record number from the order of the sort database in which the data item is sorted as a key item.
  • There may be a plurality of databases, with a database ID uniquely assigned to each database in advance and the index file generated for each database. In that case the sort database is a virtual integrated sort database, obtained by sorting a virtual integrated database (the plurality of databases virtually integrated) using the data item as a key item, and the position information specifying unit may further specify, as the position information, the database ID of the database to which the desired record belongs.
  • The index file for each data item may include a value list that stores the unique item values belonging to the data item in a predetermined order, a cumulative number list that stores, in the storage order of the value list, the cumulative number of records in the database for each item value, and a sort list that stores the order of the record numbers after the database is sorted in the predetermined order using the data item as a key item.
  • Alternatively, the index file for each data item may include a sort list that stores the order of the record numbers after the database is sorted in a predetermined order using the data item as a key item, and an original data list that stores the item values of the data item of the database in the initial arrangement order.
  • The position information specifying unit may include a first search unit that uses the index file for each data item and specifies the position information of the specified item value of the data item. Further, the position information specifying unit may include a second search unit that specifies the position information of a specified position in the sort database using the index file for each data item. The position information specifying unit may further include a record number calculation unit that calculates, for each database and for each item value of each data item, the number of records smaller than the item value and the number of records equal to the item value.
  • a record extracting unit that extracts the desired record from the database according to the position information specified by the position information specifying unit may be further provided.
  • a record position information specifying method for specifying position information of a desired record in a database including a record storing item values for each predetermined data item and a record number uniquely assigned to each record.
  • An index file that can acquire the record number from the item value of the data item, and that can acquire the record number from the order of a sorted database obtained by sorting the database using the data item as a key item,
  • a record position information specifying method characterized by including a position information specifying step for specifying the position information by specifying the record number of the desired record using an index file generated every time.
  • The position information specifying step may include a first search step of receiving designation of a data item and an item value of the desired record and specifying the record number of the desired record using the index file of that data item. The position information specifying step may also include a second search step of receiving, as the designation of the desired record, a data item of the record and a rank in the sort database, and specifying the record number of the desired record in the sort database using the index file of that data item. In the second search step, the position information specifying step may include a record number calculating step of calculating, for each database and for each item value of the data item of the record, the number of records smaller than the item value and the number of records equal to the item value.
  • the integrated database is a virtual integrated sort database in which the data items are sorted as key items.
  • The database ID of the database to which the desired record belongs may be further specified as the position information.
  • an information processing apparatus including a position information specifying unit that specifies position information of a desired record, a database including records storing item values for each predetermined data item stored in a storage device, A record position information specifying method for specifying position information of a record having a target value that is a predetermined item value of a target item that is a predetermined data item in a database in which a record number is uniquely assigned in advance.
  • the storage device further stores an index file for each of the data items that can be searched.
  • the index file stores a value list that stores unique item values belonging to the data item in a predetermined order, and the value list. In the storage order, the cumulative number of records in the database is stored for each item value.
  • the presence / absence determination step for determining whether or not the target item of the database has the target value, and when determined to be present in the presence / absence determination step, using the cumulative number list and the sort list,
  • a record position information specifying method comprising: specifying a record number of the target value and specifying a record number as the position information.
  • the information storage device includes a record that stores an item value for each predetermined data item stored in a storage device.
  • a record position information specifying method for specifying position information of a record having a target value that is a predetermined item value of a target item that is a predetermined data item in a database in which a record number is uniquely assigned in advance. The storage device further stores an index file for each of the data items that can be searched. The index file includes a sort list that stores the order of the record numbers after the database is sorted in a predetermined order using the data item as a key item, and an original data list that stores the values of the data item in the database in the initial arrangement order.
  • The method includes a presence/absence rank determination step of accessing the original data list of the target item and determining whether the target item of the database has the target value and, if so, its rank; and a record number specifying step of, when the target value is determined to be present, specifying that rank in the original data list as the record number of the target value and hence as the position information.
  • an information processing apparatus including a position information specifying unit that specifies position information of a desired record, a plurality of databases including records storing item values for each predetermined data item stored in a storage device, Virtual records in a virtual integrated sort database in which a plurality of databases are virtually integrated and sorted as key items in a plurality of databases in which record numbers are uniquely assigned in advance to each record.
  • the storage device further stores an index file for each of the data items that can be searched for each database, The index file stores unique item values belonging to the data item.
  • a search value determination step for determining a search value including a position, the value list of the key item, the cumulative number list, and the sort list, and a search value corresponding to the target position in the determined search value is A position information specifying step for specifying the table to which the table belongs and the rank in the table as the position information.
  • An information processing apparatus including a position information specifying unit that specifies position information of a desired record includes a record that stores an item value for each predetermined data item stored in a storage device.
  • a target position which is a virtual position in a virtual integrated sort database obtained by virtually integrating the plurality of databases and sorting predetermined data items as key items in a plurality of databases assigned record numbers to
  • a record position information specifying method for specifying position information of a record wherein the storage device further stores an index file for each data item that can be searched for each database, and the index file is stored in the database. Are sorted in the specified order using the data item as a key item.
  • a record position information specifying method comprising: a table to which a search value corresponding to the target position in a search value belongs; and a position information specifying step for specifying a rank in the table as the position information.
  • a record extraction method for extracting a desired record from a database comprising a record for storing an item value for each predetermined data item and a record number uniquely assigned to each record.
  • a record extraction method including a record extraction step of extracting the desired record in accordance with position information specified by the method.
  • the computer is a plurality of databases each of which stores values for each predetermined data item, and each database of each database is assigned with a unique record number in advance.
  • An information processing program that functions as position information specifying means for specifying position information of a desired record using an index file included in the index file, wherein the index file is generated from each of the databases, and for each data item,
  • an information processing program for acquiring the record number from the item value of a data item and acquiring the record number from the rank of a sorted database obtained by sorting the data item as a key item.
  • the information processing program may be provided by being recorded on a computer-readable storage medium.
  • a first information processing apparatus that manages a database that is connected via a network and that includes records that store item values for each predetermined data item, and a second information processing that specifies position information of the desired record
  • the first information processing apparatus includes an index file for each data item that can be a search target, and each record is uniquely assigned a record number in advance,
  • the index file for each data item can acquire the record number from the item value of the data item, and can acquire the record number from the order of the sort database in which the data item is sorted as a key item.
  • the second information processing apparatus specifies the record number as the position information. A database system characterized by these features is provided.
  • There may be a plurality of databases, with a database ID uniquely assigned to each database in advance and the index file generated for each database. In that case the sort database is a virtual integrated sort database, obtained by sorting a virtual integrated database (the plurality of databases virtually integrated) using the data item as a key item, and the second information processing apparatus may be configured to further specify, as the position information, the database ID of the database to which a desired record belongs.
  • at least one database among the plurality of databases to be managed may be stored on different first information processing apparatuses connected to the network.
  • FIG. 1 is a diagram for explaining an outline of a database system 100 according to an embodiment of the present invention and functional blocks of an information processing apparatus provided in the database system 100.
  • a plurality of information processing apparatuses 110-0, 110-1, and 110-2 are connected via a network 120.
  • When the information processing apparatuses need not be distinguished, they are collectively referred to as the information processing apparatus 110.
  • the case where three information processing apparatuses 110 are connected to the network 120 is shown, but the number of information processing apparatuses 110 connected is not limited thereto.
  • Each information processing apparatus 110 holds a database, described later, and functions as a data management apparatus that manages the database it holds.
  • As a data management apparatus, it also provides, for example, a database browsing function and a search function.
  • Each information processing apparatus 110 includes a CPU 111, a memory 112, a storage device 113, and a network interface (NWIF).
  • Each information processing device 110 is connected to an input device 115 and a display device 116 which are user interfaces of the information processing device 110.
  • an external storage device 117 may be connected.
  • the information processing apparatuses 110-0, 110-1, and 110-2 store the databases 200-0, 200-1, and 200-2, respectively.
  • the database is represented by the tabular data 201 when it is not necessary to distinguish between the databases.
  • the database 200 is stored in the storage device 113 or the external storage device 117 of each information processing apparatus 110.
  • The information processing apparatuses 110-0, 110-1, and 110-2 hold index files 300-0, 300-1, and 300-2 of the databases 200-0, 200-1, and 200-2, respectively. When the index files need not be distinguished, they are collectively referred to as the index file 300.
  • the index file 300 is stored in the storage device 113 or the memory 112 of each information processing device 110.
  • the index file 300 is created at an arbitrary time interval. For example, it is created every time a predetermined amount of data is collected.
  • the database of the present embodiment may be structured tabular data, semi-structured data, or unstructured data.
  • the structured tabular data 201 is an array of one or more records (rows) 213 including item values 212 corresponding to one or more data items (columns) 211 as shown in FIG.
  • Each record 213 is given a record number (RecNo.) 214.
  • This record number is information indicating the position where the record is stored in the tabular data 201.
  • This record number is given to the tabular data 201 at a predetermined timing.
  • the predetermined timing is, for example, the time when the tabular data 201 is created.
  • each record can be accessed by designating a record number.
  • the records are not always arranged in the order of the record numbers (Rec No.) 214.
  • When the tabular data 201 at the time of creation (referred to as the original tabular data 201) is sorted so that the item values 212 are arranged in ascending order using a predetermined data item 211 as a key item, the order of records in the sorted tabular data 201s differs from the order of records in the original tabular data 201.
  • FIG. 2B shows a sorting result when the tabular data 201 is sorted in ascending order using the data item 211 “Name” as a key item.
  • information indicating the order of records in the database 200 of each aspect is referred to as a record order number (rank) 215.
  • In the original tabular data 201, the record order number 215 matches the record number (RecNo.) 214.
  • FIG. 2A exemplifies five records 213 including three data items 211: <Gender>, <Name>, and <Age>.
  • For example, in the record 213 whose record number 214 is 0, the item value 212 of the data item 211 <Gender> is “female”, the item value 212 of the data item 211 <Name> is “Jemi”, and the item value 212 of the data item 211 <Age> is “2”.
  • the number of data items 211 and the number of records 213 are not limited thereto.
  • The item value 212 may be either numeric data or text data, provided that an order can be uniquely assigned to it. For example, numeric data such as 2, 1, ... is stored as the item value 212 of the data item 211 <Age>, and text data such as Jemi, Griza, ... is stored as the item value 212 of the data item 211 <Name>.
  • The data item 211 of the tabular data 201 of this embodiment may be a repeated item that can store a plurality of item values 212 in each record 213.
  • Here, the case where the data item 211 <Name> is a repeated item is illustrated.
  • The order of the plurality of item values 212 stored in a repeated item normally does not matter. That is, the tabular data 201 shown in FIG. 2C and the tabular data 201 shown in FIG. 2D are considered to be logically the same.
  • the semi-structured data 202 basically has the same configuration as the tabular data 201. That is, it is an array of one or more records including item values 212 corresponding to one or more data items 211. However, in the semi-structured data 202, the data item 211 includes a data item 211 that is guaranteed to have a value and a data item 211 that is not guaranteed.
  • <ID> is a data item 211 that is guaranteed to have a value, while the other data items 211 (<name>, <address>, <gender>, <age>, <food>) are not guaranteed to have one.
  • the unstructured data 203 also basically has the same configuration as the tabular data 201. That is, it is an array of one or more records 213 including item values 212 corresponding to one or more data items 211. However, in the unstructured data 203, there is no data item for which data is guaranteed to exist.
  • The semi-structured data 202 and the unstructured data 203 are handled by mapping them to a structure similar to that of the tabular data 201, as shown in FIGS. 3C and 3D, respectively.
  • How an item value 212 without a value (a NULL item) is handled is determined in advance.
  • In the following description, a NULL item is handled as the minimum value of each data item 211.
  • the tabular data 201 included in each information processing apparatus 110 is referred to as a table.
  • Each table is uniquely assigned an identification number i in advance.
  • the tabular data 201-0, 201-1 and 201-2 are referred to as Table 0, Table 1 and Table 2 with identification numbers 0, 1 and 2, respectively.
  • a plurality of tables may be provided in one information processing apparatus 110.
  • the identification number i of each table is called a table ID.
  • the information processing apparatus 110 specifies position information of a desired record from a table group that is distributed and managed.
  • a database obtained by virtually integrating a table group that is distributed and managed in the order of table IDs is referred to as a virtual integrated database (virtual integrated DB).
  • a database in which the virtual integrated DB is sorted using predetermined data items as key items is referred to as a virtual integrated sort database (virtual integrated sort DB).
  • the record order number of the virtual integrated sort DB is called a virtual row (Vrec).
  • FIG. 4 is a diagram for explaining the virtual integrated DB and the virtual integrated sort DB.
  • Here, the case where the search target table group consists of table 0 (Table 0) and table 1 (Table 1) is illustrated.
  • The virtual integrated DB 500 is a table in which table 0 and table 1 are virtually integrated in the order of table IDs.
  • The virtual integrated sort DB 510 is obtained by sorting the virtual integrated DB 500 using a predetermined data item (here, <Name>) as the key.
  • the item 501 indicates a table ID and a record number.
  • the table 0 is the tabular data 201 shown in FIG. 2A, and is structured tabular data having five records.
  • Table 1 is unstructured data with six records and NULL items.
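The relationship between the tables, the virtual integrated DB, and the virtual integrated sort DB can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary representation of records, and the sample names are hypothetical, and the tie-breaking rule (equal values keep table-ID and record-number order) is an assumption rather than something this text specifies. NULL items are treated as the minimum value, as described above.

```python
def virtual_integrated_db(tables, item):
    """Concatenate tables in table-ID order.

    Each row is (table_id, rec_no, value); a missing item becomes None (NULL).
    """
    return [(table_id, rec_no, record.get(item))
            for table_id, table in enumerate(tables)
            for rec_no, record in enumerate(table)]


def virtual_integrated_sort_db(tables, item):
    """Sort the virtual integrated DB by the key item, NULL first.

    sorted() is stable, so rows with equal values keep their
    table-ID / record-number order (an assumption of this sketch).
    """
    rows = virtual_integrated_db(tables, item)
    return sorted(
        rows,
        key=lambda row: (row[2] is not None, row[2] if row[2] is not None else ""),
    )


# Hypothetical sample data: table 0 is structured, table 1 has a NULL item.
table0 = [{"Name": "Jemi"}, {"Name": "Grizza"}]
table1 = [{"Name": "Alma"}, {}, {"Name": "Grizza"}]

vsort = virtual_integrated_sort_db([table0, table1], "Name")
# The list position of each row plays the role of the virtual row (Vrec).
for vrec, (table_id, rec_no, value) in enumerate(vsort):
    print(vrec, table_id, rec_no, value)
```

In practice the point of the index files is to avoid materializing this sorted list; the sketch only shows what a virtual row refers to.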
  • When the user specifies a data item 211 and an item value 212, the information processing apparatus 110 searches the table group, identifies the records 213 that have the specified item value 212 for that data item 211, and returns their position information.
  • The position information is the table ID of the table (affiliation table) to which a record 213 with the specified item value 212 belongs, together with its record number.
  • When the user specifies a data item 211 to be used as the key item for generating the virtual integrated sort DB 510 and a virtual row (Vrec), the position information of the record 213 at that virtual row (Vrec) is returned.
  • FIG. 5 shows a functional block diagram of the information processing apparatus 110 that realizes the above functions.
  • the information processing apparatus 110 of this embodiment includes an index creation unit 410 and a position information identification unit 420.
  • Each of these functions is realized by the CPU 111 included in the information processing apparatus 110 loading a program stored in the storage device 113 in advance into the memory 112 and executing it. Details of each part will be described below.
  • the index creation unit 410 creates the index file 300 from the tabular data 201 at an arbitrary time interval.
  • the index file 300 created by the index creation unit 410 of this embodiment will be described.
  • the index file 300 according to the present embodiment includes one or more elements provided to speed up the process of specifying the position of a desired record 213 from the tabular data 201 managed on each information processing apparatus 110.
  • FIG. 6 is a diagram for explaining the index file 300 of the present embodiment.
  • the index creation unit 410 according to the present embodiment creates the following index files 300 for all tables that are distributed and managed.
  • the index file 300 created from the tabular data 201 shown in FIG. 2A will be described as an example.
  • the index file 300 is generated for each data item 211 of the tabular data 201.
  • the data item 211 for creating the index file 300 is called an item of interest.
  • FIG. 6A shows an example in which the item of interest is <Gender>.
  • FIG. 6B shows an example in which the item of interest is <Name>.
  • FIG. 6C shows an example in which the item of interest is <Age>.
  • the index file 300 according to the present embodiment includes a value list (VL) 310, an accumulation number list (CAGR) 320, and a sort list (SOS) 330.
  • Each list is composed of an element and a rank (Ord) indicating a record sequence number as its position.
  • Each element can be extracted from its list by specifying the rank (Ord). Further, the element at rank j (starting from 0) in a list ABC is denoted ABC[j].
  • The VL 310 is a list in which the unique item values 212 appearing in the item of interest are sorted in a predetermined order (for example, ascending or descending order) and stored as elements. Specifically, the VL 310 is generated by sorting the tabular data 201 in a predetermined order using the item of interest as a key and removing duplicate values from the result (sorted tabular data 201s).
  • The SOS 330 stores, as elements, the record numbers 214 in the order in which they are arranged when the tabular data 201 is sorted using the item of interest as a key. Sorting is performed in the same order as for the VL 310. By providing the SOS 330, the record number 214 corresponding to a sorted item value 212 can be freely extracted.
  • CAGR 320 stores an accumulated value of the number of records of each item value 212 as an element.
  • the number of records is accumulated in the order of VL310.
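As a minimal sketch of how the three lists could be built for one item of interest, assuming ascending order and a column without NULL items: the function name `build_index` and the sample column are hypothetical and are not taken from the figures.

```python
def build_index(column):
    """Build (VL, CAGR, SOS) for one data item.

    column[rec_no] is the item value of the record with that record number.
    """
    # SOS: record numbers in the order they appear after an ascending sort
    # of the column (stable, so equal values keep record-number order).
    sos = sorted(range(len(column)), key=lambda rec_no: column[rec_no])

    vl = []    # unique item values, ascending
    cagr = []  # cumulative number of records, in VL order
    for rank, rec_no in enumerate(sos):
        value = column[rec_no]
        if not vl or vl[-1] != value:
            vl.append(value)
            cagr.append(0)
        # rank + 1 records have been seen up to and including this value
        cagr[-1] = rank + 1
    return vl, cagr, sos


# Hypothetical <Name> column with five records (record numbers 0..4).
names = ["Jemi", "Grizza", "Alma", "Grizza", "Toko"]
vl, cagr, sos = build_index(names)
print(vl)    # ['Alma', 'Grizza', 'Jemi', 'Toko']
print(cagr)  # [1, 3, 4, 5]
print(sos)   # [2, 1, 3, 0, 4]
```

All three lists have at most as many elements as the column has records, which is consistent with the statement that the index size is proportional to the database size.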
  • With the CAGR 320, the storage range in the SOS 330 of each element of the VL 310 can be determined. That is, when j is larger than 0, the record numbers for the element VL[j] of the VL 310 are stored in the section [CAGR[j-1], CAGR[j]) of the SOS 330, that is, at ranks CAGR[j-1] through CAGR[j]-1. The record numbers for the element VL[0] of the VL 310 are stored at the ranks in the section [0, CAGR[0]) of the SOS 330.
  • a closed section is indicated by []
  • an open section is indicated by ().
  • the element “Grizza” of VL rank 1 will be described.
  • the element of rank 0 of CAGR 320 is “1”, and the element of rank 1 of CAGR 320 is “3”. Therefore, “Grizza” is stored in the range of the rank [1, 3) of the SOS 330, that is, the range of the rank [1, 2].
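The range lookup in this example can be sketched as below. The helper names are hypothetical; the first two CAGR elements (1 and 3) follow the text, while the remaining CAGR elements and the SOS contents are made-up values for illustration only.

```python
def storage_range(cagr, j):
    """Half-open range [start, end) of SOS ranks holding the records of VL[j]."""
    start = cagr[j - 1] if j > 0 else 0
    return start, cagr[j]


def record_numbers(cagr, sos, j):
    """Record numbers of all records whose item value is VL[j]."""
    start, end = storage_range(cagr, j)
    return sos[start:end]


cagr = [1, 3, 4, 5]    # CAGR[0] = 1 and CAGR[1] = 3 as in the text; rest assumed
sos = [2, 1, 3, 0, 4]  # assumed sort list

print(storage_range(cagr, 1))        # (1, 3): ranks 1 and 2 of SOS
print(record_numbers(cagr, sos, 1))  # [1, 3]
```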
  • FIGS. 7A and 7B show an example of an index file 300 when the item of interest is ⁇ Name>.
  • FIG. 7A shows the index file 300 of table 0.
  • FIG. 7B shows the index file 300 of table 1.
  • the position information specifying unit 420 searches the table group using the index file 300 in accordance with an instruction from the user, and specifies position information of a predetermined record.
  • The position information specifying unit 420 includes a first search unit 421 that, in response to the designation of a data item 211 and a predetermined item value 212, searches for records having that item value 212 for the data item 211 and specifies their position information; a second search unit 422 that, in response to the designation of a data item 211 as the sort key item and a virtual row (Vrec), searches for the record at that virtual row (Vrec) and specifies its position information; and a record number calculation unit 423 that calculates the specified numbers of records.
  • The record number calculation unit 423 of the present embodiment provides two functions, represented by expressions (1) and (2), and, when the first search unit 421 and the second search unit 422 search for position information, calculates the numbers of records given by expressions (3) to (6). The calculation uses the VL 310, CAGR 320, and SOS 330 of the designated data item 211.
  • Hereinafter, the lists of table i are referred to as VL(i), CAGR(i), and SOS(i), respectively.
  • CLTP (i) [j] obtained by Expression (1) is the number of records belonging to a value smaller than the item value of the order j of VL (i).
  • CEQP (i) [j] obtained by Expression (2) is the number of records belonging to a value equal to the item value of rank j of VL (i).
  • CLTV (i) ⁇ x> obtained by Expression (3) is the number of records belonging to a value smaller than a predetermined item value x in the table i.
  • Case 1 is the case where the item value x exists in VL(i); j is the rank of the item value x in VL(i).
  • Case 2 is the case where the item value x does not exist in VL(i) but a value smaller than x exists among the item values of VL(i); j is the rank of the maximum such item value.
  • Case 3 is the case where the item value x does not exist in VL(i) and no value smaller than x exists among the item values of VL(i).
  • CEQV (i) ⁇ x> obtained by Expression (4) is the number of records belonging to a value equal to the predetermined item value x in the table i.
  • case1 is a case where the item value x exists in VL (i)
  • j is a rank in the VL (i) of the item value x.
  • Case 2 is a case where the item value x does not exist in VL (i).
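The per-table counts described above can be sketched with a binary search over an ascending value list. This is an illustrative reading of the definitions, not the patent's own code: the lowercase function names mirror CLTP, CEQP, CLTV, and CEQV, and the sample lists are hypothetical.

```python
from bisect import bisect_left


def cltp(cagr, j):
    """Number of records with a value smaller than VL[j]."""
    return cagr[j - 1] if j > 0 else 0


def ceqp(cagr, j):
    """Number of records with a value equal to VL[j]."""
    return cagr[j] - cltp(cagr, j)


def cltv(vl, cagr, x):
    """Number of records in one table with a value smaller than x."""
    j = bisect_left(vl, x)          # index of the first value >= x
    if j < len(vl) and vl[j] == x:  # Case 1: x exists at rank j
        return cltp(cagr, j)
    if j > 0:                       # Case 2: vl[j - 1] is the largest value < x
        return cagr[j - 1]
    return 0                        # Case 3: no value smaller than x


def ceqv(vl, cagr, x):
    """Number of records in one table with a value equal to x."""
    j = bisect_left(vl, x)
    if j < len(vl) and vl[j] == x:  # Case 1: x exists at rank j
        return ceqp(cagr, j)
    return 0                        # Case 2: x does not exist


# Hypothetical index lists for one table.
vl = ["Alma", "Grizza", "Jemi", "Toko"]
cagr = [1, 3, 4, 5]

print(cltv(vl, cagr, "Grizza"), ceqv(vl, cagr, "Grizza"))  # 1 2
print(cltv(vl, cagr, "Hana"), ceqv(vl, cagr, "Hana"))      # 3 0
```

Only VL and CAGR are consulted, so each count costs one binary search per table regardless of how many records the table holds.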
  • CALTV ⁇ x> obtained by Expression (5) is the number of records belonging to a value smaller than a predetermined item value x in the virtual integrated DB 500 and the virtual integrated sort DB 510.
  • CAEQV ⁇ x> obtained by Expression (6) is the number of records belonging to a value equal to a predetermined item value x in the virtual integrated DB 500 and the virtual integrated sort DB 510.
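Expressions (1) to (6) themselves do not appear in this text. The following is a reconstruction inferred solely from the verbal definitions above, so it should be read as an assumption rather than as the published formulas. Here the tables are assumed to have table IDs 0 to M-1; in expression (3), j is the rank of x in VL(i) in Case 1 and the rank of the largest item value smaller than x in Case 2.

```latex
\begin{align}
\mathrm{CLTP}(i)[j] &=
  \begin{cases}
    0 & (j = 0) \\
    \mathrm{CAGR}(i)[j-1] & (j > 0)
  \end{cases} \tag{1} \\
\mathrm{CEQP}(i)[j] &= \mathrm{CAGR}(i)[j] - \mathrm{CLTP}(i)[j] \tag{2} \\
\mathrm{CLTV}(i)\langle x \rangle &=
  \begin{cases}
    \mathrm{CLTP}(i)[j] & \text{(Case 1)} \\
    \mathrm{CAGR}(i)[j] & \text{(Case 2)} \\
    0 & \text{(Case 3)}
  \end{cases} \tag{3} \\
\mathrm{CEQV}(i)\langle x \rangle &=
  \begin{cases}
    \mathrm{CEQP}(i)[j] & \text{(Case 1)} \\
    0 & \text{(Case 2)}
  \end{cases} \tag{4} \\
\mathrm{CALTV}\langle x \rangle &= \sum_{i=0}^{M-1} \mathrm{CLTV}(i)\langle x \rangle \tag{5} \\
\mathrm{CAEQV}\langle x \rangle &= \sum_{i=0}^{M-1} \mathrm{CEQV}(i)\langle x \rangle \tag{6}
\end{align}
```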
  • When the data item 211 and the item value are given by the user, the first search unit 421 returns the position information in the distribution management target table. That is, the table ID and record number of each record having the value are specified from the value.
  • The search is performed on VL(i) in the index file 300 having the data item 211 as the target item, to determine whether the specified item value is present and, if so, to identify its position.
  • the search for VL (i) is performed using a bisection method or the like.
  • the record number is specified by the above method using CAGR (i) and SOS (i).
  • FIG. 8 is a processing flow example of the first search process by the first search unit 421 of the present embodiment.
  • the number of tables to be searched is M (M is an integer of 1 or more). It is assumed that the table group to be searched is determined in advance. At this time, the search result is stored in the first search result storage area in the storage device 113.
  • a search target data item 211 (Target Item: TI) and an item value 212 (Target Value: TV) are given by the user
  • the index file 300 of the data item TI of the table i is accessed.
  • VL (i) is accessed and the item value TV is searched (step S1102).
  • The search is performed using a bisection method or the like. If the item value TV exists in VL(i), its rank is extracted, CAGR(i) is accessed, and the storage range of the item value TV in SOS(i) is specified by the above-described method (step S1103). According to the obtained storage range, SOS(i) is accessed and the record numbers 214 of the item value TV are obtained (step S1104). The obtained record numbers 214 are additionally stored in the first search result storage area in association with the table ID of the table being searched (step S1105).
  • The process then moves to the next table and is repeated from step S1102 until all tables have been processed (steps S1106 and S1107).
  • If the item value TV does not exist in VL(i) in step S1102, the process proceeds directly to step S1106 and is repeated.
  • a set of table ID and record number stored in the first search result storage area is output as position information (step S1108).
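The flow of the first search process can be sketched as follows. This is an illustrative reading of steps S1102 to S1108, assuming the same hypothetical list layout as above (CAGR[j] counts the records smaller than VL[j] and has one trailing total entry), so that the storage range of a value in SOS is CAGR[j] to CAGR[j + 1] - 1. The sample data and the function name are assumptions.

```python
from bisect import bisect_left

# Hypothetical index lists of two tables for the item TI (not the data of the figures).
TABLES = [
    {"VL": ["Anna", "Grizza", "Jemi", "Zed"], "CAGR": [0, 1, 3, 4, 5],
     "SOS": [2, 1, 3, 0, 4]},
    {"VL": ["Bob", "Carl", "Grizza", "Jemi"], "CAGR": [0, 1, 2, 3, 4],
     "SOS": [0, 2, 1, 3]},
]

def first_search(tables, tv):
    """Return (table ID, record number) pairs of all records whose item value is tv."""
    results = []                                    # first search result storage area
    for table_id, t in enumerate(tables):           # loop over tables (S1106, S1107)
        j = bisect_left(t["VL"], tv)                # bisection search of VL(i) (S1102)
        if j == len(t["VL"]) or t["VL"][j] != tv:
            continue                                # tv is absent: go to the next table
        start, end = t["CAGR"][j], t["CAGR"][j + 1] # storage range in SOS(i) (S1103)
        for record_number in t["SOS"][start:end]:   # read record numbers (S1104)
            results.append((table_id, record_number))  # store with table ID (S1105)
    return results                                  # position information (S1108)
```

Because only the index lists are read, the tables themselves are not touched until the caller decides to fetch the located records.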
  • the second search unit 422 returns the position information of the corresponding record. That is, the table ID and the record number 214 of the record of the designated virtual row TP of the virtual integrated sort DB 510 are specified.
  • The VL 310 is accessed in the order of the table ID, and a value at a predetermined position (for example, near the center) is extracted as a temporary search value (provisional search value).
  • For the temporary search value, a virtual row (temporary virtual row) in the virtual integrated sort DB 510 is obtained.
  • The obtained temporary virtual row is compared with the designated virtual row, and the search is repeated until they match. Then, the position information of the matching temporary search value is calculated.
  • The temporary virtual row of the temporary search value is calculated from Expression (5) and Expression (6) by the record number calculation unit 423.
  • The range of the temporary virtual row (rank) is [CALTV<temporary search value>, CALTV<temporary search value> + CAEQV<temporary search value>), that is, from CALTV<temporary search value> to CALTV<temporary search value> + CAEQV<temporary search value> - 1.
  • FIG. 9 is a processing flow example of the second search process by the second search unit 422 of the present embodiment.
  • the number of tables to be searched is M (M is an integer of 1 or more).
  • an area for storing the search result in the storage device 113 is set as a second search result storage area.
  • an area that holds the value extracted as the temporary search value is set as a temporary search value storage area.
  • When TP is given as a designated virtual row by the user, first, the table number to be searched and the second search result storage area are initialized (step S1201). Then, the index file 300 of the key item TI used when creating the virtual integrated sort DB 510 is accessed for the table i.
  • VL(i) is accessed, and the provisional search value vp is determined according to a predetermined rule (step S1202).
  • As the predetermined rule, for example, the median is extracted as described above.
  • the rank of the temporary search value vp in the VL (i) is j.
  • the determined provisional search value vp and rank j are additionally registered in the provisional search value storage area (step S1203).
  • the record number calculation unit 423 is caused to calculate the range of the virtual row (temporary virtual row) of the temporary search value vp (step S1204).
  • the designated virtual row TP is compared with the range of the temporary virtual row (step S1205).
  • If the designated virtual row TP is within the range of the temporary virtual row, the temporary search value vp is determined to be the value V_TP of the virtual row (step S1209).
  • Using the value V_TP, the position information specifying process, which specifies the table ID and record number of the virtual row TP, is performed (step S1210), and the process ends.
  • Otherwise, it is determined whether a new temporary search value can be determined in the table i according to a predetermined rule (step S1206).
  • When the designated virtual row TP is smaller than the minimum of the temporary virtual row range, the new value is determined between the temporary search value vp in VL(i) and the largest of the temporary search values already stored in the temporary search value storage area that are smaller than vp.
  • When the designated virtual row TP is larger than the maximum of the temporary virtual row range, the new value is determined between the temporary search value vp in VL(i) and the smallest of the stored temporary search values that are larger than vp.
  • If it can be determined, a new temporary search value vp is determined (step S1207), the process returns to step S1203, and the process is repeated.
  • When a new temporary search value vp cannot be determined within the above range, the process moves to the next table (step S1208) and is repeated from step S1202.
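The search for the value V_TP of a designated virtual row can be sketched as a bisection over VL(i), table by table. This is a simplified sketch rather than the flow of FIG. 9 itself: instead of a temporary search value storage area it keeps the lower and upper rank bounds of the bisection, which narrows the candidates in the same way. The data, the CAGR layout (one trailing total entry), and the function names are assumptions.

```python
from bisect import bisect_left

# Hypothetical index lists (not the data of the figures); CAGR[j] is assumed to be the
# number of records smaller than VL[j], with one trailing entry for the table total.
TABLES = [
    {"VL": ["Anna", "Grizza", "Jemi", "Zed"], "CAGR": [0, 1, 3, 4, 5]},
    {"VL": ["Bob", "Carl", "Grizza", "Jemi"], "CAGR": [0, 1, 2, 3, 4]},
]

def counts(t, x):
    """Return (records smaller than x, records equal to x) for one table."""
    j = bisect_left(t["VL"], x)
    found = j < len(t["VL"]) and t["VL"][j] == x
    less = t["CAGR"][j]          # valid whether or not x is in VL, given the layout
    equal = t["CAGR"][j + 1] - t["CAGR"][j] if found else 0
    return less, equal

def virtual_row_range(tables, x):
    """Inclusive range of virtual rows that value x occupies in the sorted view."""
    less = sum(counts(t, x)[0] for t in tables)     # CALTV<x>
    equal = sum(counts(t, x)[1] for t in tables)    # CAEQV<x>
    return less, less + equal - 1

def find_value_of_virtual_row(tables, tp):
    """Return the value stored at virtual row tp, or None if tp is out of range."""
    for t in tables:                                # move on when a table is exhausted
        lo, hi = 0, len(t["VL"]) - 1
        while lo <= hi:
            j = (lo + hi) // 2                      # temporary search value: median rank
            vp = t["VL"][j]
            first, last = virtual_row_range(tables, vp)
            if first <= tp <= last:
                return vp                           # vp is the value V_TP
            if tp < first:
                hi = j - 1                          # look among smaller values
            else:
                lo = j + 1                          # look among larger values
    return None
```

A value that appears only in a later table is found after the earlier tables run out of candidates, which mirrors the move to the next table in step S1208.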
  • FIG. 10 is a processing flow example of the position information specifying process of the present embodiment by the second search unit 422.
  • First, affiliation table determination processing, which determines the table ID of the table to which the record belongs, is performed, beginning with step S1301.
  • The total number AC(i)<V_TP> of records having a value equal to the value V_TP in the tables up to and including the table i is calculated (step S1302).
  • AC(i) is calculated by the following equation (7); that is, it is the sum of CEQV(k)<V_TP> over the tables k from 0 to i.
  • The rank POS(i)<V_TP> (calculated virtual row) in the virtual integrated sort DB 510 of the record having the largest rank among the records having a value equal to the item value V_TP of the table i is determined.
  • This POS(i)<V_TP> is obtained by the following formula (8), in which AC(i)<V_TP> is added to the total number of records CALTV<V_TP> having a value smaller than the item value V_TP (step S1303).
  • The calculated virtual row POS(i)<V_TP> is compared with the designated virtual row TP (step S1304).
  • If the calculated virtual row is greater than or equal to the designated virtual row TP, the affiliation table of the record corresponding to the virtual row TP is determined as the table i (step S1305).
  • If the calculated virtual row is smaller than the designated virtual row TP in step S1304, the process moves to the next table (step S1310), returns to step S1302, and is repeated.
  • The record order AA of the record corresponding to the virtual row TP among the records belonging to the value equal to the item value V_TP in the table i is calculated (step S1307). This is obtained by subtracting 1 from the value obtained by subtracting POS(i-1)<V_TP> (or CALTV<V_TP>) from the virtual row TP.
  • the order Ord in SOS (i) is calculated (step S1308).
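The position information specifying process can be sketched as follows. Equations (7) and (8) are not reproduced here, so this sketch fixes its own 0-based convention as an assumption: POS(i) is taken as CALTV + AC(i) - 1, AA as the 0-based order among the equal-valued records of table i, and Ord as CLTV(i) + AA. The data, the CAGR layout with a trailing total entry, and the function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical index lists (not the data of the figures).
TABLES = [
    {"VL": ["Anna", "Grizza", "Jemi", "Zed"], "CAGR": [0, 1, 3, 4, 5],
     "SOS": [2, 1, 3, 0, 4]},
    {"VL": ["Bob", "Carl", "Grizza", "Jemi"], "CAGR": [0, 1, 2, 3, 4],
     "SOS": [0, 2, 1, 3]},
]

def counts(t, x):
    """Return (records smaller than x, records equal to x) for one table."""
    j = bisect_left(t["VL"], x)
    found = j < len(t["VL"]) and t["VL"][j] == x
    less = t["CAGR"][j]
    equal = t["CAGR"][j + 1] - t["CAGR"][j] if found else 0
    return less, equal

def locate(tables, tp, v_tp):
    """Return (table ID, record number) of virtual row tp, whose value is v_tp.

    Precondition (assumed): tp lies inside the virtual row range of v_tp.
    """
    base = sum(counts(t, v_tp)[0] for t in tables)  # CALTV<V_TP>
    ac_prev = 0                                     # equal records in earlier tables
    for table_id, t in enumerate(tables):
        less_i, equal_i = counts(t, v_tp)
        ac = ac_prev + equal_i                      # AC(i)<V_TP> (S1302)
        pos = base + ac - 1                         # last equal row of table i (S1303)
        if pos >= tp:                               # comparison with TP (S1304, S1305)
            aa = tp - (base + ac_prev)              # order among equal records (S1307)
            ord_ = less_i + aa                      # rank in SOS(i) (S1308)
            return table_id, t["SOS"][ord_]
        ac_prev = ac                                # next table (S1310)
    return None
```

With this sample data, virtual rows 3 and 4 resolve to records 1 and 3 of table 0, and virtual row 5 resolves to record 1 of table 1.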
  • the second search unit 422 accesses the index file 300 in which the item of interest shown in FIG. First, VL (0) of the table 0 is accessed, and for example, “Jemi” having a rank of 2 is extracted as the temporary search value vp. Then, the record number calculation unit 423 obtains the range of the rank of “Jemi” in the virtual integrated sort DB 510. Here, [6, 7] is obtained.
  • Since the designated virtual row TP is a smaller value outside this range, a smaller value is extracted again as the temporary search value vp in VL(0).
  • Here, “Grizza” is set to vp.
  • [3, 5] is obtained as the range of rank of “Grizza” in the virtual integrated sort DB 510. Since the virtual row TP is within the range, the temporary search value vp “Grizza” is set as the virtual row value V_TP.
  • The process moves to the next table 1 and performs the same processing.
  • For the record with the highest rank among the “Grizza” records in the table 1, the virtual row 5 in the virtual integrated sort DB 510 is obtained. Since this is not smaller than the virtual row TP, the affiliation table of the record of the virtual row TP is determined as 1.
  • The table ID of the table to which the record belongs and the record number are output as the position information.
  • the present invention is not limited to this.
  • Sequential record numbers (integrated record numbers) across the tables may be used as the position information instead.
  • the integrated record number is obtained by adding the total number of records in a table having a table ID smaller than the own table to the record number of the own table.
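The integrated record number described above amounts to an offset by the record counts of the preceding tables. The following sketch assumes 0-based table IDs and record numbers; the table sizes and the function name are hypothetical.

```python
def integrated_record_number(table_sizes, table_id, record_number):
    """Add the record counts of all tables with a smaller table ID to record_number.

    table_sizes[k] is the number of records in table k (0-based IDs assumed).
    """
    return sum(table_sizes[:table_id]) + record_number

# Hypothetical example: table 0 holds 5 records and table 1 holds 4 records.
SIZES = [5, 4]
first_of_table_1 = integrated_record_number(SIZES, 1, 0)
```

In this example the first record of table 1 receives the integrated record number 5, directly after the five records of table 0.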
  • the number of databases set as search targets may be one.
  • the first search unit 421 and the second search unit 422 search only the index file 300 of the database and return only the record number as the position information.
  • the record number of the record having the item value can be obtained.
  • the record number of the record can be obtained by designating a predetermined row of the sorted database using a predetermined data item as a key item.
  • In the above description, each information processing apparatus 110 includes the index creation unit 410 and the position information identification unit 420, but the present invention is not limited to this.
  • The position information specifying unit 420 may instead be provided in an information processing apparatus that is independent of the information processing apparatuses 110 holding the databases and that can transmit and receive data to and from each of them.
  • the information processing apparatus 110 including the position information specifying unit 420 accesses the information processing apparatus 110 including the desired database 200 and the index file 300, and executes the processing by the position information specifying unit 420.
  • the user may select a database to be integrated and search for data.
  • A list of databases that can be selected by the user may be displayed, and the selection may be received from the list.
  • the user may specify the data item 211 and the item value 212 to be subjected to the first search process.
  • the user may instruct the designated virtual row TP for performing the second search process.
  • the information processing apparatus 110 may further include a display control unit.
  • the display control unit accesses the table according to the position information specified by the first search unit 421 or the second search unit 422, extracts records, and displays them in the display area of the display device 116. That is, the display control unit implements a record extraction function and a display function.
  • a search process in which a specific item value is specified can be realized.
  • the search process is realized as follows.
  • the first search unit 421 specifies the position information of the record having the item value 212 specified by the user.
  • the display control unit extracts the record from each table and displays it on the display area of the display device 116.
  • browsing processing of the virtual integrated sort DB 510 can be realized.
  • the browsing process is realized as follows.
  • the second search unit 422 specifies the position information of each record of a predetermined number of virtual rows including the virtual row TP designated by the user.
  • the position information of the virtual rows of the number of rows (here, L rows) that can be displayed in the display area of the display device 116 is specified.
  • the display control unit extracts these records from each table i and displays them in the display area of the display device 116 in the order of virtual rows. For example, each time the virtual row TP designated by the user is changed by a scroll operation or the like, this series of processing is performed to update the display.
  • As described above, the database 200 of this embodiment is provided with an index file 300 that, when an item value 212 is designated, returns the position information of the records belonging to that item value and, when a virtual row TP of the virtual integrated sort DB 510 is designated, returns the position information of the virtual row TP.
  • the position information specifying unit 420 searches for a record designated by the user using the index file 300, and specifies the position information.
  • Even if the database 200 is managed in a distributed manner, it is possible to return the position information of the record at a specified rank in a virtually integrated and sorted state.
  • a user can easily search for a desired record using the index file 300 of the present embodiment, regardless of whether the database is single or distributed and managed in a plurality of databases.
  • the position information can be specified.
  • The storage area used by an index such as a B-tree, conventionally used for searching a large database, grows at an accelerating rate (O(n log n)) as the amount of data in the original database increases.
  • In contrast, the size of the index file 300 of this embodiment is proportional to the size of the original database (O(n)), so that even if the original database is enormous, the storage area is not significantly strained.
  • each list constituting the index file 300 of this embodiment can be accessed in order (Ord). Further, the above search is realized only by searching the index file 300. For this reason, the amount of communication between sites that are pre-distributed and managed for searching can be suppressed. Therefore, the communication amount does not increase when searching and extracting records.
  • the index file 300 of the present embodiment has a simple configuration as described above, it can be created regardless of the database type. Therefore, it is possible to easily specify and extract the position of desired data regardless of the database type to be managed. In addition, prior design for searching is not necessary.
  • The index file 300 enables very high-speed search and is scalable enough that a database of 1 trillion records can practically be constructed. Furthermore, since the index file 300 of the present embodiment uses a unique record number as an index that can be used even between databases with different schemas, it is suited to wide-area distribution, and cooperation between databases that are behind each other is also possible. Moreover, according to this embodiment, a server is not required; that is, a search is performed using the client CPU. For this reason, the number of available CPUs increases as the number of clients increases, and a large number of clients can be connected without difficulty. Further, since the system is serverless, a server system and server software are unnecessary, and a database system can be constructed at low cost.
  • Second Embodiment: Next, a second embodiment to which the present invention is applied will be described. It provides the same functions as the first embodiment but uses a different index.
  • the database system of this embodiment is basically the same as the database system 100 of the first embodiment shown in FIG. The same applies to each device of the database system 100.
  • the index file 300 is different as described above. Therefore, the configuration of the index file 300 in the information processing apparatus 110 is different, and the processes of the index creation unit 410 and the position information identification unit 420 are different.
  • the applicable database types are also different.
  • the present embodiment will be described focusing on the configuration different from the first embodiment.
  • the functional configuration of the information processing apparatus 110 basically includes an index creating unit 410 and a position information specifying unit 420 as in the first embodiment shown in FIG. And the positional information specific
  • the index creation unit 410 creates the index file 300 from the tabular data 201 at an arbitrary time interval, as in the first embodiment. For example, it is created every time a predetermined amount of data is collected. However, the index file 300 to be created is different.
  • FIG. 12 is a diagram for explaining the index file 300 of the present embodiment.
  • the index creation unit 410 according to the present embodiment creates the following index files 300 for all tables that are distributed and managed.
  • the index file 300 of this embodiment is also one or more lists in an array format including one or more elements created for each data item 211 of the tabular data 201, as in the first embodiment.
  • the data item 211 for creating the index file 300 is referred to as a focused item.
  • the index file 300 created from the tabular data 201 shown in FIG. 2A of the first embodiment will be described as an example.
  • FIG. 12A is an example in which the item of interest is <Gender>,
  • FIG. 12B is an example in which the item of interest is <Name>, and
  • FIG. 12C is an example in which the item of interest is <Age>.
  • the index file 300 includes a sort list (SOS) 330 and a list (original data list: ORG) 340 composed of data of an item of interest in the original table.
  • Each list includes an element and a rank (Ord) indicating its position.
  • Each element can be extracted from its list by specifying the rank (Ord).
  • the element of the rank j starting from 0 in the list ABC is denoted as ABC [j].
  • the configuration and creation method of the SOS 330 are the same as those in the first embodiment.
  • each list of the index file 300 is created for each table.
  • FIG. 13A and FIG. 13B show an example of the index file 300 when the item of interest is ⁇ Name>.
  • FIG. 13A shows the index file 300 of the table 0, and
  • FIG. 13B shows the index file 300 of the table 1.
  • SOS 330 and ORG 340 are used as the index file 300.
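The two lists of this embodiment can be sketched as follows. This is a minimal illustration that assumes ORG holds the item values in record order and SOS holds the record numbers in ascending order of value, with ties kept in record-number order; the column data and the function name are hypothetical.

```python
def build_index(column):
    """Build the ORG and SOS lists for one data item of one table.

    ORG: the item values in record order (rank = record number).
    SOS: the record numbers sorted by item value; Python's sort is stable,
         so records with equal values stay in record-number order.
    """
    org = list(column)
    sos = sorted(range(len(org)), key=org.__getitem__)
    return {"ORG": org, "SOS": sos}

# Hypothetical <Name> column of one table (not the data of the figures).
INDEX_0 = build_index(["Jemi", "Grizza", "Anna", "Grizza", "Zed"])
```

Both lists have exactly one element per record, which is consistent with the index size being proportional to the size of the original table.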
  • any of structured data, semi-structured data, and unstructured data may be used, as in the first embodiment.
  • one item value is stored in each data item.
  • the position information specifying unit 420 of this embodiment also specifies position information in accordance with an instruction from the user.
  • the first search unit 421 searches for a record having the item value 212 of the data item 211 and identifies position information.
  • The second search unit 422 searches for the record of the virtual row (Vrec) in the virtual integrated sort DB 510 in response to the designation of the data item 211 serving as the sort key item and the virtual row (Vrec), and returns the position information.
  • the first search process by the first search unit 421 searches for and specifies position information of a record having a specified value.
  • When the search target data item 211 (Target Item: TI) and the item value 212 (Target Value: TV) are specified, the first search unit 421 of this embodiment searches the ORG 340 in the order of the table ID.
  • The search uses a conventional search method such as a bisection method.
  • Each time a match is found, the first search unit 421 takes the rank (Ord) of the matching element as the record number and additionally stores it, together with the table ID, in the first search result storage area.
  • the second search process of the second search unit 422 of this embodiment also returns position information of the corresponding record when a key item and a virtual row (Vrec) of the virtual integrated sort DB 510 are designated by the user. That is, the table ID and the record number 214 of the record of the designated virtual row TP of the virtual integrated sort DB 510 are specified.
  • The ORG 340 is accessed in the order of the table ID, and a value at a predetermined position (for example, near the center) is extracted as a provisional search value.
  • For the provisional search value, a virtual row (temporary virtual row) in the virtual integrated sort DB 510 is obtained.
  • The obtained temporary virtual row is compared with the designated virtual row, and the search is repeated until they match. Then, the position information of the matching provisional search value is calculated.
  • the flow of the second search process of the present embodiment is basically the same as the second search process shown in FIGS. 9 and 10 of the first embodiment.
  • However, the method of determining the initial temporary search value vp in step S1202, the information stored in the temporary search value storage area in step S1203, and the method of determining a new temporary search value vp in step S1206 are different.
  • In addition, the methods by which the record number calculation unit 423 calculates CLTV(i)<x>, the number of records in the table i belonging to values smaller than the value x, and CEQV(i)<x>, the number of records belonging to the value equal to x, both used in the second search process, differ from the first embodiment.
  • the above-described record number calculation process by the record number calculation unit 423 of this embodiment will be described.
  • the record number calculation unit 423 searches for ORG (i) and acquires the rank (Ord) in the table (i) when the value x is designated.
  • The search is performed using a bisection method or the like and continues until a rank (Ord) is identified.
  • CLTV(i)<x> is obtained as the minimum rank e1 of the storage range, and CEQV(i)<x> is obtained as the number of elements in the storage range, that is, the value obtained by subtracting the minimum rank e1 from the maximum rank e2 and adding 1.
  • The method of calculating the number of records CALTV<x> belonging to values smaller than the value x and the number of records CAEQV<x> belonging to the value equal to x in the virtual integrated DB 500, used in the second search process, is the same as in the first embodiment.
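The counts of this embodiment can be sketched with a bisection that reads the values of ORG through SOS. The sketch assumes that the storage range of x means the ranks e1 to e2 of SOS whose records hold x, so that CLTV is e1 and CEQV is e2 - e1 + 1; the data and the function names are hypothetical.

```python
# Hypothetical index of one table: ORG in record order, SOS sorted by value.
ORG_0 = ["Jemi", "Grizza", "Anna", "Grizza", "Zed"]
SOS_0 = [2, 1, 3, 0, 4]

def bound(org, sos, x, include_equal):
    """Bisection over SOS: first rank whose value is >= x, or > x if include_equal."""
    lo, hi = 0, len(sos)
    while lo < hi:
        mid = (lo + hi) // 2
        value = org[sos[mid]]              # value of the record at sorted rank mid
        if value < x or (include_equal and value == x):
            lo = mid + 1
        else:
            hi = mid
    return lo

def storage_range(org, sos, x):
    """Return (e1, e2), the inclusive ranks of x in SOS; e2 < e1 when x is absent."""
    return bound(org, sos, x, False), bound(org, sos, x, True) - 1

def cltv(org, sos, x):
    """Records with a value smaller than x: the minimum rank e1."""
    return storage_range(org, sos, x)[0]

def ceqv(org, sos, x):
    """Records with a value equal to x: e2 - e1 + 1."""
    e1, e2 = storage_range(org, sos, x)
    return e2 - e1 + 1

def matching_records(org, sos, x):
    """Record numbers of the records whose value is x."""
    e1, e2 = storage_range(org, sos, x)
    return sos[e1:e2 + 1]
```

The same storage range also yields the record numbers of a given value, since they are the elements of SOS at ranks e1 to e2.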
  • the first provisional search value vp is determined in the following procedure in each table i. That is, first, SOS (i) is accessed, and an element (ElementA) at a predetermined position (for example, near the center) is extracted. Then, the ORG 340 is accessed, and the element (ValueB) of the record having the element (ElementA) in the rank (Ord) is extracted and set as the provisional search value vp.
  • In step S1203 of this embodiment, the rank (Ord) of the temporary search value vp in ORG(i) and its rank (Ord) in SOS(i) are stored together with the temporary search value vp.
  • In step S1206, the new provisional search value vp is sequentially determined by performing the bisection method in SOS(i). When the designated virtual row TP is smaller than the minimum value of the temporary virtual row, the new value is determined between the rank in SOS(i) of the current temporary search value vp and the rank in SOS(i) of the largest of the temporary search values already stored in the temporary search value storage area that are smaller than the current provisional search value vp.
  • When the designated virtual row TP is larger than the maximum value of the temporary virtual row, the new value is determined between the rank in SOS(i) of the current temporary search value vp and the rank in SOS(i) of the smallest of the stored temporary search values that are larger than the current provisional search value vp.
  • The second search unit 422 accesses the index file 300 whose item of interest is <Name> in Table 0 shown in FIG. Then, SOS(0) is accessed, and for example, element 0 having a rank of 3 is extracted. Then, ORG(0) is accessed, and the element “Jemi” with rank 0 is extracted as the provisional search value vp.
  • the range of the ranking of “Jemi” in the virtual integrated sort DB 510 is obtained.
  • [6, 7] is obtained. Since the virtual row TP is a smaller value outside this range, a value having a smaller rank is re-extracted as the temporary search value vp in SOS (0). For example, element 1 with rank 1 is extracted, ORG (0) is accessed, and element “Grizza” with rank 1 is set as a new temporary search value vp.
  • [3, 5] is obtained as the range of the rank of “Grizza” in the virtual integrated sort DB 510. Since the virtual row TP is within the range, “Grizza” is set as the value V_TP of the virtual row.
  • The number of “Grizza” records up to Table 0 is calculated (AC(0)<Grizza>), and 2 is obtained. Further, the total number of values smaller than “Grizza” (CALTV<Grizza>) in the virtual integrated sort DB 510 is 3. Therefore, the virtual row in the virtual integrated sort DB 510 of the record with the highest rank among the “Grizza” records in the table 0 is 4.
  • The process moves to the next table 1 and performs the same processing.
  • For the record with the highest rank among the “Grizza” records in the table 1, the virtual row 5 in the virtual integrated sort DB 510 is obtained. Since this is not smaller than the virtual row TP, the affiliation table of the record of the virtual row TP is determined as 1.
  • the position information specifying unit 420 may be constructed in an information processing device independent of the information processing device 110 that holds the database. Furthermore, a display control unit similar to that of the first embodiment may be provided so that search processing, browsing processing, and the like can be realized. In addition, an interface that allows the user to specify item values to be specified and extraction targets, a virtual row, and an interface from which the user can select a database to be searched may be provided.
  • The configuration of the index file 300 is not limited to the configuration of each of the above embodiments. That is, any index file can be used as long as it is created from the original database, its size is proportional to the size of the original database, it can return the position information of the records that satisfy a given data item and value, and it can return the position information of the record at a specified rank when the databases are virtually integrated and sorted by a predetermined data item. For example, it may be a combination of a first list from which the number of records of a predetermined item value (including 0) can be determined and a second list from which the rank of each record after sorting by a predetermined data item can be grasped.
  • 100 Database system
  • 110 Information processing device
  • 111 CPU
  • 112 Memory
  • 113 Storage device
  • 114 NWIF
  • 115 Input device
  • 116 Display device
  • 117 External storage device
  • 120 Network
  • 200 Database
  • 201 Tabular data
  • 201s Tabular data after sorting
  • 202 Semi-structured data
  • 203 Semi-structured data
  • 203 Unstructured data
  • 204 Unstructured data
  • 212 Item value
  • 213 Record
  • 214 Record number
  • 215 Record order number
  • 300 Index file
  • 310 VL
  • 320 CAGR
  • 340 ORG
  • 410 Index creation unit
  • 420 Position information identification unit
  • 421 First search unit
  • 422 Second search unit
  • 423 Record number calculation unit
  • 500 Virtual integrated DB
  • 501 Table ID and record number
  • 510 Virtual integrated sort DB

Abstract

The invention provides a technology offering an easy-to-use environment in which a large database is managed inexpensively and without restrictions on the usage environment. The information processing device comprises a position information specifying unit that specifies the position information of a desired record in a database in which a plurality of records are stored, each given a unique number. The unit specifies the position information of the record by means of an index that returns the record number of a designated value and that returns the record number corresponding to a rank after sorting by a prescribed item. The size of the index is proportional to the size of the original database.
PCT/JP2013/071127 2012-08-29 2013-08-05 Information processing device, record position information specifying method, and information processing program WO2014034383A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-189041 2012-08-29
JP2012189041A JP2015207026A (ja) 2012-08-29 2012-08-29 Information processing device, record position information specifying method, and information processing program

Publications (1)

Publication Number Publication Date
WO2014034383A1 (fr) 2014-03-06

Family

ID=50183201

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/071127 WO2014034383A1 (fr) 2012-08-29 2013-08-05 Information processing device, record position information specifying method, and information processing program

Country Status (2)

Country Link
JP (1) JP2015207026A (fr)
WO (1) WO2014034383A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6744179B2 (ja) * 2016-09-14 2020-08-19 株式会社エスペラントシステム Data integration method, data integration device, data processing system, and computer program
JP2021067962A (ja) * 2018-02-21 2021-04-30 株式会社ターボデータラボラトリー Information processing system and information processing method
JP6974666B1 (ja) * 2021-08-05 2021-12-01 株式会社インフォメックス Search device, search method, and program
JP6970867B1 (ja) * 2021-06-30 2021-11-24 株式会社インフォメックス Search device, search method, and program
CN115803730A (zh) * 2021-06-30 2023-03-14 株式会社英弗麦斯 Search device, search method, and recording medium
WO2023152965A1 (fr) * 2022-02-14 2023-08-17 晋二 古庄 Data providing device, data providing method, and program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0362239A (ja) * 1989-07-31 1991-03-18 Mitsubishi Electric Corp File input/output device
JPH06168274A (ja) * 1993-06-25 1994-06-14 Matsushita Electric Ind Co Ltd Information retrieval device
JPH1091644A (ja) * 1996-09-10 1998-04-10 Oki Electric Ind Co Ltd Database query processing method and device
WO2008155852A1 (fr) * 2007-06-21 2008-12-24 Turbo Data Laboratories Inc. Method and device for locating tabular data in a shared-memory parallel processing system

Also Published As

Publication number Publication date
JP2015207026A (ja) 2015-11-19

Similar Documents

Publication Publication Date Title
JP5873935B2 (en) Database management method, management computer, and storage medium
US10268652B2 (en) Identifying correlations between log data and network packet data
WO2014034383A1 (en) Information processing device, document location information specification method, and information processing program
KR101443475B1 (en) Search suggestion clustering and presentation
CN101055585B (en) Document clustering system and method
CN101055580B (en) System, method, and user interface for retrieving documents
JP2017157192A (en) Method for matching images with content items based on keywords
CN103136228A (en) Image search method and image search device
US11748351B2 (en) Class specific query processing
JP2014134991A (en) Pattern extraction device and control method
US20110082803A1 (en) Business flow retrieval system, business flow retrieval method and business flow retrieval program
US9407589B2 (en) System and method for following topics in an electronic textual conversation
JP6390139B2 (en) Document search device, document search method, program, and document search system
US8458187B2 (en) Methods and systems for visualizing topic location in a document redundancy graph
WO2011134141A1 (en) Method for extracting a named entity
US20120239657A1 (en) Category classification processing device and method
JP2016157290A (en) Document search device, document search method, and document search program
WO2012115254A1 (en) Search device, search method, search program, and computer-readable storage medium for storing a search program
WO2017065891A1 (en) Automated join detection
KR101823463B1 (en) Apparatus for providing a researcher search service and method thereof
US20160179857A1 (en) Database joins using uncertain criteria
JP5980723B2 (en) Emergency know-how reference support method, emergency know-how reference support device, and emergency know-how reference support program
US10275497B2 (en) Electronic whiteboard system, search result display method of electronic whiteboard, and non-transitory computer readable medium storing program thereof
JP5743938B2 (en) Associative search system, associative search server, and program
US20160092459A1 (en) Translating a keyword search into a structured query

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13833613; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 13833613; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)