WO2014010082A1 - Search device, control method for search device, and recording medium
- Publication number
- WO2014010082A1 PCT/JP2012/067942
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- metadata
- file
- name
- schema
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
- G06F16/1824—Distributed file systems implemented using Network-attached Storage [NAS] architecture
- G06F16/183—Provision of network file services by network file servers, e.g. by using NFS, CIFS
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/84—Mapping; Conversion
- G06F16/86—Mapping to a database
Definitions
- the present invention relates to a search server control method that provides a function of searching a file group stored in a file storage system.
- the search server analyzes file data stored in the computer system and creates a search index in advance.
- the user can transmit a search query for searching for a file to be acquired to the search server, and access the target file based on the search result returned by the search server.
- the metadata search service extracts data consisting of a set of metadata names and metadata values included in the search target file, and creates a search index for the data in advance.
- the user can acquire the search result by specifying a search condition related to the metadata name and metadata value to the search server.
- to use metadata search, definition information specifying which metadata should be indexed and searched must be registered in advance, based on the data format and data schema of the search target.
- the definition information requires information such as a metadata name for identifying the name of the metadata and a metadata format for defining values and data structures that the metadata can take.
- the data search target for metadata search is not limited to one type, and there are many cases where multiple data formats are targeted. In such a case, it is necessary to register definition information regarding metadata in each data format in the search server.
- a known technique has the search server manage definition information for multiple data formats together with mapping definition information for managing those data formats in an integrated manner.
- this technique is disclosed, for example, in Patent Document 1.
- metadata names in a plurality of data formats can be expressed in a unified notation.
- the search server can access the metadata based on a unified notation method, so that it is possible to perform indexing and search using the unified notation.
- because the search server manages metadata names in a unified notation, a user of the search service has no choice but to use either the original metadata name or the unified metadata name used by the search server.
- the present invention relates to a search device that includes a processor and a memory and searches for data. The search device has: metadata schema management information for managing a metadata schema definition that defines the structure of a search target file including metadata; search index schema management information for managing a search index schema definition that defines the structure of search index data; schema mapping management information for managing the correspondence between the metadata schema definition and the search index schema definition; and a search control unit that accepts a search request and extracts files matching the request with reference to the schema mapping management information and the search index management information.
- the metadata schema management information includes a namespace name that identifies the metadata schema definition, an alias of the namespace name, and the metadata name of the metadata. The search index schema definition includes a field name of the file to be searched, and the schema mapping management information includes the correspondence between the metadata name and the field name.
- the search control unit extracts at least one of the alias and the metadata name from the search request, refers to the metadata schema management information to convert the alias to a metadata name, and refers to the schema mapping management information to convert the metadata name to the field name.
- the search device can easily identify a file having specific metadata, making it possible to provide a search service for files having the desired metadata.
- FIG. 1 is a block diagram illustrating an example of a computer system according to a first embodiment of this invention.
- a block diagram showing an example of the functional parts of the management server according to the first embodiment of this invention.
- a block diagram showing an example of the functional parts of the file server according to the first embodiment of this invention.
- a block diagram showing an example of the functional parts of the management machine according to the first embodiment of this invention.
- a block diagram showing an example of the functional parts of the client machine according to the first embodiment of this invention.
- an explanatory diagram illustrating the flow of a series of processes according to the first embodiment of this invention.
- FIG. 10 is the first half of a flowchart showing an example of the search index update process according to the second embodiment of this invention.
- the second half of the flowchart showing an example of the search index update process according to the second embodiment of this invention.
- an explanatory diagram illustrating the search index schema management table according to the third embodiment of this invention.
- the first half of a flowchart showing an example of the search index schema definition registration process according to the third embodiment of this invention.
- FIG. 10 is a block diagram of a computer system according to a sixth embodiment of this invention.
- a block diagram showing an example of the functional parts of the metadata management server according to the sixth embodiment of this invention.
- the first half of a flowchart showing an example of the search index update process according to the sixth embodiment of this invention.
- the second half of the flowchart showing an example of the search index update process according to the sixth embodiment of this invention.
- a flowchart showing an example of the file access process according to the sixth embodiment of this invention.
- FIG. 1 is a block diagram showing an example of a computer system according to the first embodiment of this invention.
- update of the schema definition of the search index in the search server 1100 will be described.
- a search server 1100 that provides a metadata search function as well as a full-text search function enables customization of metadata search targets. As a result, it is possible to perform a search for metadata of a new data format or a search for custom metadata uniquely expanded in a known data format.
- FIG. 1 is a block diagram illustrating the configuration of a computer system in an embodiment of the present invention.
- a search server 1100, a file server 2100, a management machine 3100, and a client machine 4100 are connected via the network 100.
- the search server 1100 provides a file search service for files stored in the file server 2100.
- This file search service provides a full-text search function by specifying a search keyword and a metadata search function in which a search target metadata name and search conditions are specified.
- the file server 2100 receives a file access request from a user and provides a file sharing service.
- the management machine 3100 is a machine used by a system administrator to maintain and manage the search server 1100 and the file server 2100.
- the client machine 4100 can receive an input from the user by the input device 4171 and make a search request to the search server 1100 or make a file access request to the file server 2100.
- the user can search for a file stored in the file server 2100 using the search server 1100.
- in FIG. 1, one of each component is shown, but this is not a limitation; where possible, the system may include a plurality of each component. Likewise, although FIG. 1 shows each component as a separate device, any two or more components may be implemented as a single device where possible. The network 100 may take any form, for example an Internet connection or an intranet connection over a local area network.
- FIG. 2 is a block diagram illustrating the hardware configuration and functional parts of the search server 1100.
- the search server 1100 includes a processor 1110 that executes programs, a memory 1120 that temporarily stores programs and data, an external storage device I/F 1130 for accessing the external storage device 1160, a network I/F 1140 for accessing other devices connected to the network 100, and a bus 1150 that connects these components.
- the memory 1120 stores an external storage device I/F control program 1121 for controlling the external storage device I/F 1130, a network I/F control program 1122 for controlling the network I/F 1140, a data control program 1123 that provides a file system or database used to manage data stored in the search server 1100, and a search control program (search control unit) 1124 that provides the indexing and search services of the search server 1100.
- the memory 1120 also stores the tables used by the search control program 1124: the metadata schema management table 7100, the schema mapping management table 7200, the search index schema management table 7300, the search index management table 7400, and the search index registration file management table 7500.
- the search control program 1124 includes a search index schema control subprogram (search index schema control unit) 1171, a file access control subprogram (file access control unit) 1172, an indexing control subprogram (indexing control unit) 1173, and a search response control subprogram (search response control unit) 1174.
- the search index schema control subprogram 1171 manages the search index schema definition used for the file search service provided by the search server 1100.
- the search index schema definition defines how the search target file is indexed. For example, for full-text search, all text data can be indexed by dividing it into tokens of an arbitrary length, or specific metadata name and value pairs can be indexed. The specific search index schema definition will be described later.
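The two indexing styles mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the overlapping character n-gram tokenizer, the `fulltext` field label, and both function names are hypothetical and are not taken from the specification.

```python
from __future__ import annotations


def ngram_tokens(text: str, n: int = 2) -> list[str]:
    """Split text into overlapping character n-grams (one possible tokenization)."""
    compact = "".join(text.split())  # drop whitespace before tokenizing
    if len(compact) <= n:
        return [compact] if compact else []
    return [compact[i:i + n] for i in range(len(compact) - n + 1)]


def index_entries(text: str, metadata: dict, n: int = 2) -> list[tuple[str, str]]:
    """Build (field, value) index entries for one file.

    Full-text tokens go under the assumed field label "fulltext";
    each metadata name/value pair becomes its own entry.
    """
    entries = [("fulltext", token) for token in ngram_tokens(text, n)]
    entries += [(name, str(value)) for name, value in metadata.items()]
    return entries
```

A search index schema definition would then decide which of these entry kinds are produced for a given file type.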
- the file access control subprogram 1172 performs processing in which the search server acquires data and metadata of the file stored in the file server.
- the indexing control subprogram 1173 analyzes the data and metadata of the index update target file and reflects the result in the search index that the search server 1100 manages for the search service. Specifically, it analyzes the data and metadata of the index update target file acquired by the file access control subprogram 1172 and reflects them in the search index management table 7400 and the search index registration file management table 7500 managed by the search server.
- the search response control subprogram 1174 receives a search request from the user, generates a search result using the search index management table 7400 and the search index registration file management table 7500 of its own search server, and returns the result.
- the metadata schema management table 7100, the schema mapping management table 7200, the search index schema management table 7300, the search index management table 7400, and the search index registration file management table 7500 will be described later.
- the processor 1110 operates as a functional unit that realizes a predetermined function by executing each program loaded in the memory 1120.
- the processor 1110 functions as a search control unit by operating according to the search control program 1124, and functions as a data management unit by executing the data control program 1123. The same applies to other programs.
- the processor 1110 also operates as a functional unit that implements each of a plurality of processes executed by each program.
- a computer and a computer system are an apparatus and a system including these functional units.
- information such as the programs and tables that realize each function of the search server 1100 can be stored in the external storage device 1160, in a storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.
- FIG. 3 is a block diagram illustrating the hardware configuration and functional parts of the file server 2100.
- the file server 2100 includes a processor 2110 that executes programs, a memory 2120 that temporarily stores programs and data, an external storage device I/F 2130 for accessing the external storage device 2160, a network I/F 2140 for accessing other devices connected via the network, and a bus 2150 that connects these components.
- the memory 2120 stores an external storage device I/F control program 2121 for controlling the external storage device I/F 2130, a network I/F control program 2122 for controlling the network I/F 2140, a file control service 2123 that provides a file system or database used to manage data stored in the file server 2100, and a file sharing control program 2124 that provides a file sharing service in which files are stored in the file server 2100 and shared among a plurality of users.
- the processor 2110 operates as a functional unit that realizes a predetermined function by executing each program loaded in the memory 2120.
- the processor 2110 functions as a file sharing control unit by executing the file sharing control program 2124.
- each program and table can be stored in the external storage device 2160, in a storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.
- FIG. 4 is an explanatory diagram illustrating the hardware configuration of the management machine 3100.
- the management machine 3100 includes a processor 3110 that executes programs, a memory 3120 that temporarily stores programs and data, an external storage device I/F 3130 for accessing the external storage device 3160, a network I/F 3140 for accessing other devices connected via the network, and a bus 3150 that connects these components.
- the management machine 3100 includes an input device 3171 and an output device 3172 (console or management screen), which are connected to the bus 3150 via an I/O interface (I/F). Input from the system administrator is received by the input device 3171, and responses received from the search server 1100 and other devices are output to the output device 3172.
- the memory 3120 stores an external storage device I/F control program 3121 for controlling the external storage device I/F 3130, a network I/F control program 3122 for controlling the network I/F 3140, and a search server management client control program 3124 used by the management machine 3100 for managing the search server 1100.
- a file server management client control program used for managing the file server 2100 from the management machine may be stored.
- the search server management client control program 3124 is either a management client program provided by the search server 1100 to be managed or a program that provides functions conforming to the specifications provided by the search server.
- a form using a Web application program for a search server may be used, or a form using a general-purpose Web browser may be used.
- the processor 3110 operates as a functional unit that realizes a predetermined function by executing each program loaded in the memory 3120.
- the processor 3110 functions as a search server management unit by executing the search server management client control program 3124.
- each program and table can be stored in the external storage device 3160, in a storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.
- FIG. 5 is an explanatory diagram illustrating the hardware configuration of the client machine 4100.
- the client machine 4100 includes a processor 4110 that executes programs, a memory 4120 that temporarily stores programs and data, an external storage device I/F 4130 for accessing the external storage device 4160, a network I/F 4140 for accessing other devices connected via the network, and a bus 4150 that connects these components.
- the client machine 4100 includes an input device 4171 and an output device 4172 (console or management screen), and is connected to the bus 4150 via an I / O interface (I / F) 4170.
- An input from the user is received by the input device 4171, and a response received from the search server 1100 or the like is output to the output device 4172.
- the memory 4120 stores an external storage device I/F control program 4121 for controlling the external storage device I/F 4130, a network I/F control program 4122 for controlling the network I/F 4140, a data control program 4123 that provides a file system or database used to manage data stored in the client machine 4100, a search client control program 4124 used to access the search server 1100 from the client machine 4100, and a file sharing client control program 4125 used by the client machine 4100 to access files shared in the file storage.
- the search client control program 4124 is either a client program provided by the search server 1100 to be used or a program that provides functions conforming to the specifications provided by the server.
- a form using a Web application program for a search server may be used, or a form using a general-purpose Web browser may be used.
- FIG. 6 is an explanatory diagram illustrating a flow of a series of processes performed by the computer system of the present invention.
- a metadata schema registration process (process (1-n)) from the management machine 3100 to the search server 1100
- a schema mapping definition registration process (process (2-n))
- a search index schema definition registration process (process (3-n)) from the management machine 3100 to the search server 1100
- a search index update process (process (4-n))
- the system administrator uses the search server management client control program 3124 of the management machine 3100 to send a registration request for the metadata schema definition file (7000 in FIG. 7) to the search server 1100 (process (1-1)).
- the metadata schema definition file is a file that defines a structure of a target file for performing metadata search, and defines identification information called a namespace in the structure.
- the system administrator specifies an alias for the namespace defined in the metadata schema definition file.
- this alias allows the namespace to be identified easily and uniquely within the search server 1100. Details of the metadata schema definition file will be described later.
- the search server 1100 extracts the metadata schema definition from the received metadata schema definition file and, as necessary, asks the system administrator of the requesting management machine 3100 to check the extracted contents. After updating the extracted contents where necessary, it stores the metadata schema definition in the metadata schema management table 7100 (process (1-2)). This completes the metadata schema registration process.
- the system administrator uses the search server management client control program 3124 of the management machine 3100 to call up a schema mapping definition registration screen to the search server 1100 (processing (2-1)).
- the system administrator designates a namespace alias for specifying the metadata schema definition registered in the search server 1100 in process (1-2).
- the search server 1100 uses the specified namespace alias to acquire the corresponding metadata schema definition from the metadata schema management table 7100, generates a schema mapping information candidate based on it, and presents the candidate to the request source (process (2-2)).
- the schema mapping information generated by the search server 1100 includes a field name of a search index corresponding to each metadata name in the metadata schema and a data type of each field.
- the system administrator checks the schema mapping information candidate presented by the search server 1100 on the management machine 3100 and, as necessary, sends update information for the candidate to the search server 1100 (process (2-3)).
- the search server 1100 generates the final schema mapping information from the candidate and the update information received from the request source (management machine 3100), and stores it in the schema mapping management table 7200 (process (2-4)).
- the above is a series of flow of schema mapping definition registration processing.
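Processes (2-2) to (2-4) can be pictured with the following minimal Python sketch. It is illustrative only: the function names, the dictionary layout, and the `CUSTOMMETA_<alias>_<metadata name>` field-name pattern (including the `_` separator) are assumptions, loosely modeled on the field-name example the specification gives for FIG. 9.

```python
def mapping_candidates(alias, namespace_name, schema_rows):
    """Process (2-2): derive one candidate mapping per metadata name.

    schema_rows is a list of (metadata_name, metadata_type) pairs taken
    from the registered metadata schema definition.
    """
    candidates = {}
    for metadata_name, metadata_type in schema_rows:
        candidates[metadata_name] = {
            "namespace_name": namespace_name,
            # Assumed naming pattern; the separator is hypothetical.
            "field_name": f"CUSTOMMETA_{alias}_{metadata_name}",
            # Default the field type to the metadata type.
            "field_type": metadata_type,
        }
    return candidates


def apply_updates(candidates, updates):
    """Processes (2-3)/(2-4): merge administrator edits into the candidates."""
    final = {name: dict(entry) for name, entry in candidates.items()}
    for metadata_name, changes in updates.items():
        if metadata_name not in final:
            raise KeyError(f"unknown metadata name: {metadata_name}")
        final[metadata_name].update(changes)
    return final
```

For example, an administrator who wants keyword search on a name field could send `{"ProductA/PatientName": {"field_type": "text"}}` as the update information, leaving the other candidates untouched.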
- search index schema definition registration process (3-n) will be described.
- the system administrator uses the search server management client control program 3124 of the management machine 3100 to call the search index schema registration screen for the search server 1100 (processing (3-1)).
- the system administrator specifies the alias of the namespace used in the above process (2-1) in order to use it for registration of the search index schema definition.
- the search server 1100 uses the specified namespace alias to acquire the corresponding schema mapping information from the schema mapping management table 7200, generates a search index schema definition candidate based on it, and presents the candidate to the request source (management machine 3100) (process (3-2)).
- the search index schema definition generated by the search server 1100 includes a field name in the schema mapping definition and a data type of each field.
- the system administrator confirms the content presented on the output device 3172 of the management machine 3100, and transmits update information for the candidate content to the search server 1100 as necessary (processing (3-3)).
- the search server 1100 generates the final search index schema definition from the search index schema definition candidate and the update information received from the request source (management machine 3100), and stores it in the search index schema management table 7300 (process (3-4)).
- the above is the flow of the search index schema definition registration process.
- the search control program 1124 on the search server 1100 accesses the file server 2100, identifies the files whose search index entries need to be updated, and reads those files (process (4-1)).
- the file server 2100 uses the file system 2170 that it manages as needed, acquires information on the target file, and provides it to the request source (for example, the search server 1100) (process (4-2)).
- having acquired the update target file, the search control program 1124 identifies the file type, acquires the schema definition information that matches the file type from the search index schema management table 7300, analyzes the file, and extracts the information necessary for generating search index data (process (4-3)).
- the search control program 1124 generates search index data from the extracted information and reflects it in the search index management table 7400 and the search index registration file management table 7500 (process (4-4)).
- the above is a series of search index update processing by the search server 1100.
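The update flow of processes (4-1) to (4-4) can be sketched as a small in-memory index in Python. This is a sketch under assumptions, not the specification's implementation: the class and method names are hypothetical, and using a modification time to decide whether a file needs re-indexing is an assumption, since the text does not say how such files are identified.

```python
from collections import defaultdict


class SearchIndex:
    """Toy stand-in for the search index management table (postings)
    and the search index registration file management table (registered)."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field name, value) -> file paths
        self.registered = {}              # file path -> mtime at last indexing

    def needs_update(self, path, mtime):
        """Process (4-1): a file is stale if it is new or its mtime changed."""
        return self.registered.get(path) != mtime

    def update(self, path, mtime, field_values):
        """Processes (4-3)/(4-4): replace the file's entries with fresh ones."""
        for files in self.postings.values():
            files.discard(path)  # drop stale entries for this file
        for field_name, value in field_values.items():
            self.postings[(field_name, str(value))].add(path)
        self.registered[path] = mtime

    def lookup(self, field_name, value):
        """Return the files whose indexed field has the given value."""
        return set(self.postings.get((field_name, str(value)), set()))
```

Keeping the registration table separate from the postings lets the indexer skip unchanged files without scanning the index itself.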
- the user uses the search client control program 4124 of the client machine 4100 to transmit a file search request to the search server 1100 (process (5-1)).
- the search control program 1124 on the search server 1100 converts the metadata name specified in the query statement given as the search condition into the field name used in the search index schema of the search server 1100 (process (5-2)).
- the search server 1100 acquires the information in the schema mapping management table 7200 and uses it for this conversion. After the conversion, the search control program 1124 uses the search index management table 7400 and the search index registration file management table 7500 to extract the files that match the specified search conditions, compiles the search results, and returns them to the request source (client machine 4100) (process (5-3)). When compiling the search results, the search server 1100 converts each metadata name that was converted to a field name of the search index schema back to the original metadata name. The above is the flow of the file search process.
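The name conversion in processes (5-2) and (5-3) can be sketched in Python as below. Everything concrete here is an assumption made for illustration: the `alias:metadata name` query syntax, the table contents, the `CUSTOMMETA_A2_...` field names, the file paths, and the sample values are hypothetical and do not come from the specification.

```python
# Assumed excerpts of the metadata schema and schema mapping tables.
NAMESPACE_ALIASES = {"A2": "http://AAA.com/ProductA/v2"}
SCHEMA_MAPPING = {
    ("http://AAA.com/ProductA/v2", "ProductA/PatientID"): "CUSTOMMETA_A2_ProductA/PatientID",
    ("http://AAA.com/ProductA/v2", "ProductA/PatientName"): "CUSTOMMETA_A2_ProductA/PatientName",
}


def to_field_name(qualified_name):
    """Process (5-2): resolve 'alias:metadata name' to an index field name."""
    alias, sep, metadata_name = qualified_name.partition(":")
    if not sep:
        raise ValueError("expected the form 'alias:metadata name'")
    namespace_name = NAMESPACE_ALIASES[alias]
    return SCHEMA_MAPPING[(namespace_name, metadata_name)]


def search(conditions, index):
    """Process (5-3): AND together all conditions, then report the hits
    using the names the user originally wrote, not the field names."""
    field_to_original = {to_field_name(name): name for name in conditions}
    hits = None
    for field_name, original in field_to_original.items():
        files = index.get((field_name, conditions[original]), set())
        hits = set(files) if hits is None else hits & files
    matched = {name: conditions[name] for name in field_to_original.values()}
    return [{"file": path, "matched": dict(matched)} for path in sorted(hits or set())]
```

Because the result is keyed by the user's own names, the client never needs to know the longer field names used inside the index.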
- FIG. 7 is a diagram exemplifying a configuration of a schema definition file that the system administrator registers with the search server 1100 in the metadata schema registration process.
- the schema definition file 7000 shown here shows the contents of a file in accordance with the definition of W3C XML Schema as an example.
- the schema definition file 7000 shown here declares the URI “http://AAA.com/ProductA/v2” as its namespace name.
- this file defines, as metadata, an XML document having the child elements “PatientID” (integer type), “PatientName” (string type), and “VendorPatientID” (integer type).
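As a rough illustration of extracting a metadata schema definition from such a file, the Python sketch below parses a small W3C XML Schema document with the standard library. The schema text is a reconstruction for illustration only: the namespace URI and the three child element names and types come from the description above, while the root element name `ProductA`, the overall layout, and the function name are assumptions, since FIG. 7 itself is not reproduced here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical reconstruction of a schema definition file like FIG. 7.
SCHEMA_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://AAA.com/ProductA/v2">
  <xs:element name="ProductA">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="PatientID" type="xs:integer"/>
        <xs:element name="PatientName" type="xs:string"/>
        <xs:element name="VendorPatientID" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def extract_metadata_schema(schema_text):
    """Return (namespace name, [(XPath-like metadata name, type), ...])."""
    root = ET.fromstring(schema_text)
    namespace_name = root.get("targetNamespace")
    rows = []

    def walk(node, path):
        for child in node:
            if child.tag == f"{{{XS}}}element" and child.get("name"):
                child_path = path + [child.get("name")]
                type_name = child.get("type")
                if type_name is not None:
                    # Strip the prefix, e.g. "xs:integer" -> "integer".
                    rows.append(("/".join(child_path), type_name.split(":")[-1]))
                walk(child, child_path)
            else:
                walk(child, path)  # pass through complexType, sequence, etc.

    walk(root, [])
    return namespace_name, rows
```

The slash-joined names produced here are one way to obtain XPath-style metadata names for typed leaf elements.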
- FIG. 8 is a diagram illustrating a configuration of the metadata schema management table 7100 managed on the search server 1100.
- the metadata schema management table 7100 manages schema definition information (metadata schema definition) used by a metadata search target file registered in the search server 1100.
- the metadata schema management table 7100 includes configuration information of an ID 7110, a namespace alias 7120, a namespace name 7130, a metadata name 7140, and a metadata type 7150.
- ID 7110 stores an identification number mechanically assigned to each record in this table.
- the namespace alias 7120 stores an alias that the request source (for example, a system administrator using the management machine 3100) assigns separately to the namespace defined in the schema definition file. The namespace alias 7120 is set so as to maintain uniqueness within the search server 1100.
- the namespace name 7130 stores a name indicating a namespace defined in the schema definition file 7000 when the schema definition file 7000 describing the metadata schema definition is registered.
- the namespace name 7130 is kept unique within the search server 1100. When an XML namespace is used as the namespace name 7130, a uniquely identifiable URI is used.
- the metadata name 7140 stores the metadata name extracted from the schema definition file 7000.
- the search server 1100 may extract an element name or attribute name corresponding to the XML as the metadata name 7140 and store it in the entry in the XPath format.
- the metadata type 7150 stores information related to the data type (“type” in FIG. 7) associated with the metadata name extracted from the schema definition file 7000.
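The column layout of the metadata schema management table 7100 and its uniqueness rules can be sketched in Python as follows. The class names, method names, and error handling are hypothetical; only the five columns and the rule that aliases and namespace names stay unique within the search server come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataSchemaRow:
    id: int               # ID 7110
    namespace_alias: str  # namespace alias 7120
    namespace_name: str   # namespace name 7130
    metadata_name: str    # metadata name 7140
    metadata_type: str    # metadata type 7150


class MetadataSchemaTable:
    def __init__(self):
        self.rows = []
        self._alias_to_namespace = {}

    def register(self, alias, namespace_name, fields):
        """Add one row per (metadata name, type) pair, enforcing that an
        alias maps to exactly one namespace name and vice versa."""
        known = self._alias_to_namespace.get(alias)
        if known is not None and known != namespace_name:
            raise ValueError(f"alias {alias!r} is already used for {known!r}")
        if known is None and namespace_name in self._alias_to_namespace.values():
            raise ValueError(f"{namespace_name!r} already has an alias")
        self._alias_to_namespace[alias] = namespace_name
        for metadata_name, metadata_type in fields:
            row_id = len(self.rows) + 1  # mechanically assigned ID
            self.rows.append(
                MetadataSchemaRow(row_id, alias, namespace_name, metadata_name, metadata_type)
            )

    def namespace_for_alias(self, alias):
        return self._alias_to_namespace[alias]
```

In this sketch, registering `"A2"` for `"http://AAA.com/ProductA/v2"` would let later requests name the schema by the short alias alone.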
- FIG. 9 is a diagram illustrating a configuration of the schema mapping management table 7200 managed on the search server 1100.
- the schema mapping management table 7200 manages the correspondence between the metadata schema definition in the metadata schema management table 7100 described above and the search index schema definition in the search index schema management table 7300 described later.
- the schema mapping management table 7200 includes configuration information such as a mapping ID 7210, a namespace name 7220, a metadata name 7230, a field name 7240, a field type 7250, and a field alias 7260.
- Mapping ID 7210 stores an identification number mechanically assigned to each record in this table.
- the namespace name 7220 stores the same information as the namespace name 7130 column of the metadata schema management table 7100.
- the metadata name 7230 stores the same information as the information stored in the metadata name 7140 column of the metadata schema management table 7100.
- The field name 7240 stores the name that the search index schema definition associates with the metadata name stored in the metadata name 7230 column. The field name 7240 remains unique within the search server 1100. For example, as shown in FIG. 9, an individually added field name may be formed by concatenating a character string indicating an individual extension (CUSTOMMETA_), a character string indicating the namespace alias (A2), and a character string indicating the metadata name (ProductA / ).
- the field type 7250 stores information defining the type of data stored in the field.
- The field type 7250 may store the same data as the metadata type 7150 of the metadata schema management table 7100, or a value set by the system administrator.
- the data types specified here include an integer type indicating a numeric type, a string type indicating a character string type, and a text type that allows a keyword search by dividing a character string into tokens.
- In addition, data types that do not allow values to be sorted but allow the storage size to be reduced may also be used.
- the field alias 7260 stores an alias of the field name 7240.
- the field alias 7260 may be kept unique within the name space to be used, or may be kept unique within the search server 1100.
- In the following, the case where the field alias 7260 is set so as to remain unique within the namespace is described.
- The field alias 7260 may be set mechanically in the search server 1100, or may be set by a system administrator. With the field alias 7260, a metadata search that specifies a field name to the search server 1100 can use a short field alias 7260 in place of the field name 7240, which may be a long character string because it must guarantee uniqueness.
- For a metadata name 7230 that is known in advance not to be indexed, information indicating that there is no correspondence may be stored in the field name 7240, field type 7250, and field alias 7260 columns associated with that metadata name 7230. For example, as shown in the figure, the character string "<NONE>" may be stored to indicate that there is nothing to be associated.
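- The uniqueness rule for the field alias 7260 can be sketched as follows. This is a minimal illustration, not the layout of FIG. 9: the dictionary-based records, the function name `find_alias_conflicts`, and the sample namespace URIs are all assumptions introduced here.

```python
NONE_MARKER = "<NONE>"  # stored when a metadata name has no associated field


def find_alias_conflicts(records, scope="namespace"):
    """Return pairs of mapping IDs whose field aliases collide.

    scope="namespace": aliases must be unique within one namespace name.
    scope="server":    aliases must be unique across the whole search server.
    """
    if scope not in ("namespace", "server"):
        raise ValueError("scope must be 'namespace' or 'server'")
    seen = {}        # uniqueness key -> mapping ID that first used it
    conflicts = []
    for rec in records:
        alias = rec["field_alias"]
        if alias == NONE_MARKER:
            continue  # unmapped metadata never takes part in the check
        key = (rec["namespace_name"], alias) if scope == "namespace" else alias
        if key in seen:
            conflicts.append((seen[key], rec["mapping_id"]))
        else:
            seen[key] = rec["mapping_id"]
    return conflicts


# Hypothetical records: two namespaces reuse the short alias "author".
records = [
    {"mapping_id": 1, "namespace_name": "http://example.com/a", "field_alias": "author"},
    {"mapping_id": 2, "namespace_name": "http://example.com/b", "field_alias": "author"},
    {"mapping_id": 3, "namespace_name": "http://example.com/b", "field_alias": NONE_MARKER},
]
```

- With these sample records, the per-namespace check reports no conflict, while the server-wide check reports that mapping IDs 1 and 2 share an alias. This is why a search that uses a namespace-scoped alias must also name the namespace.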
- FIG. 10 is a diagram illustrating a configuration of a search index schema management table 7300 managed on the search server 1100.
- the search index schema management table 7300 manages schema information that defines the structure of index data used when the search server 1100 provides a search service.
- Based on the information defined in the search index schema management table 7300, the search server 1100 creates the index data necessary to realize search functions such as a full-text search using keywords or a metadata search specifying a metadata name and search conditions.
- the search index schema management table 7300 includes constituent elements such as a field name 7310 and a field type 7320.
- the field name 7310 stores the same information as the information stored in the field name 7240 column of the schema mapping management table 7200 described above. However, among the names existing in the field name 7240 column of the schema mapping management table 7200, there may be a name that does not exist in the field name 7310 column of the search index schema management table 7300.
- the determination of whether or not to perform indexing can be set in units of field names 7310.
- FIGS. 11A and 11B are diagrams illustrating the configuration of a search index management table 7400 managed on the search server 1100.
- the search index management table 7400 manages information on search indexes created by the search server 1100.
- the search index management table 7400 includes configuration information such as a field name 7410, a field value 7420, and corresponding position information 7430.
- The field name 7410 stores the same information as the information stored in the field name 7310 column of the search index schema management table 7300 described above.
- the field value 7420 stores an object (numerical value, character string, etc.) obtained by analyzing the search target file by the indexing process for each field specified by the field name 7410. In the example shown in the figure, when the field name 7410 is “filename”, a character string including the file name of the search target file is stored.
- Corresponding position information 7430 registers information of a file in which an object registered in the field value field 7420 exists.
- the corresponding position information 7430 includes file identification information 7431 and 7434, corresponding position offsets 7432 and 7435, and weights 7433 and 7436.
- In the file identification information 7431 and 7434, information for identifying a file in which the object stored in the field value 7420 appears is registered.
- Corresponding position offsets 7432 and 7435 register offset information at which the object appears in the file. In this field, when the object appears in a plurality of locations in one file, a plurality of offset information is registered.
- the weights 7433 and 7436 register importance values due to the appearance of the object at the offset of the file.
- the importance value is set by the search server 1100 as appropriate. This value means that the larger the value, the more important. Also, this value can be used for narrowing down and arranging search results.
- a plurality of corresponding position information 7430 can be registered for one field value 7420.
- A null value indicating that the value of the corresponding entry is invalid can be registered in the field of the corresponding position information 7430. This can be used for entries left vacant because the number of registrations is smaller than for other entries in the column of the corresponding position information 7430, or for entries where the information of the corresponding position offsets 7432 and 7435 is unnecessary.
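- The structure of the search index management table 7400 resembles an inverted index keyed by field name and field value. The following is a minimal sketch under that reading; the class names `Posting` and `SearchIndex`, the rule of keeping the largest weight per file, and the default weight of 1.0 are assumptions, not details from FIGS. 11A and 11B.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Posting:
    """One piece of corresponding position information for a field value."""
    file_id: int                                      # file identification information
    offsets: List[int] = field(default_factory=list)  # positions within the file
    weight: float = 1.0                               # larger means more important


class SearchIndex:
    def __init__(self) -> None:
        # (field name, field value) -> postings, one per file
        self._entries: Dict[Tuple[str, str], List[Posting]] = defaultdict(list)

    def add(self, field_name: str, field_value: str, file_id: int,
            offset: Optional[int] = None, weight: float = 1.0) -> None:
        postings = self._entries[(field_name, field_value)]
        for posting in postings:
            if posting.file_id == file_id:
                # Same file again: record the extra offset, keep the larger weight.
                if offset is not None:
                    posting.offsets.append(offset)
                posting.weight = max(posting.weight, weight)
                return
        # offset=None models an entry where offset information is unnecessary.
        postings.append(Posting(file_id, [] if offset is None else [offset], weight))

    def lookup(self, field_name: str, field_value: str) -> List[Posting]:
        # Order results by weight so they can be ranked or narrowed down.
        postings = self._entries.get((field_name, field_value), [])
        return sorted(postings, key=lambda p: p.weight, reverse=True)
```

- One field value can carry several postings, and one posting can carry several offsets, matching the description of multiple corresponding position information entries per field value.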
- FIG. 12 is a diagram illustrating a configuration of a search index registration file management table 7500 managed on the search server 1100.
- The search index registration file management table 7500 manages information related to the files that the search server 1100 acquires from a file share on the file server 2100 that is the target of search index creation.
- the search index registration file management table 7500 includes components such as file identification information 7510, a file path name 7520, and metadata 7530.
- the file identification information 7510 is an identifier for uniquely identifying a file acquired by the search server 1100 for creating a search index.
- the identifier may be a serial number assigned by the search server 1100 or a serial number assigned by the file server 2100 in which the file is stored.
- a character string that can be used for identification may be used.
- the file path name 7520 corresponds to the file path name where the target file is stored. Accordingly, the search server 1100 can acquire the file by specifying the file path name 7520 and transmitting a file acquisition request to the file server 2100.
- By using this search index registration file management table 7500, when the search control program 1124 responds to a search request from a user, it usually only needs to use the search index management table 7400 to determine whether the search conditions are met. The information necessary for accessing the target file can then be acquired from the search index registration file management table 7500 as needed, only for the files that match the conditions.
- Next, the metadata schema registration processing (FIG. 13), schema mapping definition registration processing (FIG. 15), search index schema definition registration processing (FIG. 17), search index update processing (FIG. 19), and file search processing (FIG. 20) will be described.
- The operation screens, namely the metadata schema registration screen (FIG. 14), the schema mapping definition registration screen (FIG. 16), the search index schema definition registration screen (FIG. 18), and the file search screen (FIGS. 21 and 22), will also be described.
- FIG. 13 is a flowchart showing an example of a metadata schema registration process performed by the search index schema control subprogram 1171 of the search server 1100.
- In this process, when the system administrator registers a schema definition file 7000 describing the metadata schema definition in the search server 1100, the schema definition information of the metadata is extracted from the schema definition file 7000 and registered in the metadata schema management table 7100.
- First, the processor 1110 that executes the search index schema control subprogram 1171 receives the schema definition file 7000 of the search target file and the alias of the namespace defined in that file from the search server management client program 3124 on the management machine 3100 (step S101).
- Here, a schema definition file 7000 such as that shown in FIG. 7 is acquired. Note that any format other than that shown in FIG. 7 may be used as long as schema definition information can be acquired.
- In the following, the entity that executes the program is always the processor 1110, so mention of the processor 1110 is omitted to simplify the description.
- Next, the search index schema control subprogram 1171 determines whether or not the namespace alias specified for the schema definition file 7000 duplicates a namespace alias already used in the search server 1100 (step S102).
- If the namespace alias is a duplicate (Yes in step S102), an error occurs and the search index schema control subprogram 1171 ends. If it is not a duplicate (No in step S102), the search index schema control subprogram 1171 extracts the metadata schema definition from the received schema definition file 7000 (step S103).
- As the metadata schema definition, information such as the metadata name and the metadata type indicating the data type of the metadata is extracted here.
- For example, a desired metadata schema definition may be extracted by using XSLT, which converts data in one XML format into another XML format.
- the search index schema control subprogram 1171 presents (transmits) the extracted metadata information to the request source (for example, the management machine 3100) (step S104).
- Specifically, the presented information is transmitted to the search server management client program 3124 on the management machine 3100, and the search server management client program 3124 outputs it to the output device 3172 (a management screen or management console) so that the system administrator can view it.
- the search index schema control subprogram 1171 determines whether or not the extracted information needs to be changed as a result of the request source confirming the presentation information (step S105).
- information indicating whether or not a change is necessary may be included in the information returned from the search server management client program 3124 after the presentation information is transmitted. The above determination is performed using this information.
- If a change is necessary (Yes in step S105), the search index schema control subprogram 1171 acquires change information for the metadata schema definition from the request source (management machine 3100) and reflects the change information in the metadata schema definition (step S106). Thereafter, the processing from step S104 onward is repeated.
- If no change is necessary (No in step S105), the search index schema control subprogram 1171 registers the extracted metadata schema definition in the metadata schema management table 7100 and ends the process.
- The above describes the case where the schema definition file 7000 is registered in the search server 1100, its contents are analyzed, and the analysis result is used.
- Alternatively, the system administrator may directly specify the information to be registered in the metadata schema management table 7100 and register it in the management table.
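- The registration flow of FIG. 13 (steps S101 to S103 and the final registration) can be sketched as follows. The format of FIG. 7 is not reproduced here: the sketch assumes the schema definition file is an XML Schema (XSD) document whose `xs:element` entries carry `name` and `type` attributes. The function names, the dictionary record layout, and `SAMPLE_XSD` are hypothetical, and the administrator review loop (steps S104 to S106) is omitted.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace of XSD elements


def extract_schema(xsd_text):
    """Step S103: pull the namespace name and (metadata name, type) pairs."""
    root = ET.fromstring(xsd_text)
    namespace_name = root.get("targetNamespace", "")
    entries = []
    for element in root.iter(XS + "element"):
        name, type_name = element.get("name"), element.get("type")
        if name and type_name:
            entries.append((name, type_name))
    return namespace_name, entries


def register_schema(table, namespace_alias, xsd_text):
    """Steps S101 to S103 plus registration, without the review loop."""
    # Step S102: the namespace alias must be unique within the server.
    if any(rec["namespace_alias"] == namespace_alias for rec in table):
        raise ValueError(f"namespace alias already in use: {namespace_alias}")
    namespace_name, entries = extract_schema(xsd_text)
    for name, type_name in entries:
        table.append({
            "namespace_alias": namespace_alias,
            "namespace_name": namespace_name,
            "metadata_name": name,
            "metadata_type": type_name,
        })
    return table


# Hypothetical schema definition file used only for illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/productA">
  <xs:element name="author" type="xs:string"/>
  <xs:element name="revision" type="xs:integer"/>
</xs:schema>"""
```

- Calling `register_schema([], "A2", SAMPLE_XSD)` yields two records, one per element; registering a second file under the same alias raises an error, which corresponds to the Yes branch of step S102.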
- FIG. 14 shows an example of an operation screen when the above-described metadata schema registration processing is performed from the management screen of the management machine 3100.
- the metadata schema registration screen 8100 displayed on the output device 3172 of the management machine 3100 provides a metadata schema definition file registration field 8110 and a metadata schema management table registration field 8120.
- The metadata schema definition file registration field 8110 provides a namespace alias input field 8111 and a registered file name input field 8112.
- the designated schema definition file can be registered in the search server 1100 by pressing the Upload button 8113. Further, the user can cancel the file registration process by pressing the Cancel button 8114.
- the metadata schema management table registration field 8120 displays the contents of the metadata schema extracted by the search server 1100 after the above-described Upload button 8113 is pressed.
- the display contents are the same as the entries in the metadata schema management table 7100.
- To correct the displayed contents, the target record is selected in the check box field 8126 and the Edit button 8127 is pressed.
- Pressing the Delete button 8128 deletes the contents.
- Pressing the Register button 8129 registers the contents in the metadata schema management table 7100. The registration process for the metadata schema management table 7100 can be canceled by pressing the Cancel button 8130.
- FIG. 15 shows the flow of schema mapping definition registration processing performed by the processor 1110 that executes the search index schema control subprogram 1171 of the search server 1100. Note that the processor 1110 is omitted as in FIG. 13 to simplify the following description.
- In this process, the system administrator who operates the management machine 3100 specifies a namespace alias that identifies a metadata schema definition. The metadata schema definition having that namespace is acquired, the information necessary for the schema mapping definition is generated from it, and the generated information is registered in the schema mapping management table 7200.
- First, the search index schema control subprogram 1171 receives from the search server management client program 3124 on the management machine 3100 the namespace alias that identifies the metadata schema definition for which the mapping definition is to be set (step S201). Note that any format may be used as long as the metadata schema definition can be identified. For example, the namespace name itself may be used.
- Next, the search index schema control subprogram 1171 determines whether or not the namespace alias specified in step S201 is registered in the search server 1100 (step S202).
- If it is registered (Yes in step S202), the search index schema control subprogram 1171 uses the received namespace alias to acquire the metadata schema definition from the metadata schema management table 7100 (step S203).
- Specifically, among the records registered in the metadata schema management table 7100, the metadata schema definition is acquired from the records whose namespace alias 7120 matches the specified alias.
- the search index schema control subprogram 1171 generates schema mapping information candidates from the acquired metadata schema definition (step S204).
- candidates for field name 7240 are generated from namespace name 7130 and metadata name 7140
- candidates for field type 7250 are generated from metadata type 7150
- candidates for field alias 7260 are generated from metadata name 7140.
- the search index schema control subprogram 1171 transmits the generated schema mapping information candidate to the request source (for example, the management machine 3100) and presents it to the output device 3172 (step S205).
- Specifically, the presentation information is transmitted to the search server management client program 3124 on the management machine 3100, and the search server management client program 3124 outputs it to the output device 3172 so that the system administrator can view it.
- the search index schema control subprogram 1171 determines whether or not the presentation information needs to be changed as a result of the request source confirming the presentation information (step S206).
- information indicating whether or not a change is necessary may be included in the information returned from the search server management client program 3124 after the presentation information is transmitted. The above determination is made using this information.
- If a change is necessary (Yes in step S206), the search index schema control subprogram 1171 acquires change information for the schema mapping information candidate from the request source and reflects the change information in the schema mapping information candidate (step S207). Thereafter, the processing from step S205 onward is repeated.
- If no change is necessary (No in step S206), the search index schema control subprogram 1171 registers the schema mapping information in the schema mapping management table 7200 and ends the process.
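- The candidate generation and registration of FIG. 15 can be sketched as follows. This is an illustration under assumptions: the field name follows the concatenation example given for FIG. 9 (individual extension string, namespace alias, metadata name) with an assumed "_" separator, and the `TYPE_MAP` table, the fallback to the text type, and all function names are hypothetical.

```python
# Assumed mapping from metadata types to field types; unknown types fall back to "text".
TYPE_MAP = {"xs:string": "string", "xs:integer": "integer"}


def generate_mapping_candidates(metadata_table, namespace_alias):
    """Steps S202 to S204: select records by alias and derive candidates."""
    selected = [r for r in metadata_table if r["namespace_alias"] == namespace_alias]
    if not selected:
        raise LookupError(f"namespace alias not registered: {namespace_alias}")
    candidates = []
    for rec in selected:
        candidates.append({
            "namespace_name": rec["namespace_name"],
            "metadata_name": rec["metadata_name"],
            # Long but unique name: extension marker + alias + metadata name.
            "field_name": f"CUSTOMMETA_{namespace_alias}_{rec['metadata_name']}",
            "field_type": TYPE_MAP.get(rec["metadata_type"], "text"),
            # Short alias taken from the metadata name.
            "field_alias": rec["metadata_name"],
        })
    return candidates


def apply_changes(candidates, changes):
    """Step S207: changes maps a metadata name to the columns to overwrite."""
    return [{**c, **changes.get(c["metadata_name"], {})} for c in candidates]


def register_mappings(mapping_table, candidates):
    """Final registration: assign mapping IDs mechanically and append."""
    next_id = len(mapping_table) + 1
    for offset, candidate in enumerate(candidates):
        mapping_table.append({"mapping_id": next_id + offset, **candidate})
    return mapping_table


# Hypothetical contents of the metadata schema management table.
metadata_table = [
    {"namespace_alias": "A2", "namespace_name": "http://example.com/productA",
     "metadata_name": "author", "metadata_type": "xs:string"},
    {"namespace_alias": "A2", "namespace_name": "http://example.com/productA",
     "metadata_name": "revision", "metadata_type": "xs:integer"},
]
```

- In an interactive setting, `apply_changes` would be called once per round of administrator edits before `register_mappings` stores the result.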
- FIG. 16 shows an example of an operation screen when the schema mapping definition registration process described above is performed from the management screen of the management machine 3100 or the like.
- the schema mapping definition registration screen 8200 provides a metadata schema management table call column 8210 and a schema mapping management table registration column 8220.
- A namespace alias input field 8211 is provided in the metadata schema management table call field 8210. After the namespace alias is input, pressing the Call button 8212 searches the metadata schema management table 7100 for records having the specified namespace alias and outputs them to the schema mapping management table registration field 8220 described later.
- The schema mapping management table registration field 8220 displays the contents of the records of the metadata schema management table 7100 having the specified namespace alias, together with the schema mapping information generated from these records.
- The display content includes the entries of the metadata schema management table 7100 (ID 8221, namespace alias 8222, namespace name 8223, metadata name 8224, metadata type 8225) and the entries of the schema mapping information (field name 8226, field type 8227, field alias 8228).
- To correct (update) the displayed contents, the target record is selected in the check box field 8229 and the Edit button 8230 is pressed.
- Pressing the Delete button 8231 deletes the contents.
- FIG. 17 shows the flow of search index schema definition registration processing performed by the processor 1110 that executes the search index schema control subprogram 1171 of the search server 1100. Note that, as before, the description of the processor 1110 is omitted to simplify the description.
- First, the search index schema control subprogram 1171 receives from the search server management client program 3124 on the management machine 3100 a namespace alias that identifies the schema mapping information for which the search index schema definition is to be registered (step S301).
- Here, a namespace alias is used to identify the schema mapping information, but any format may be used as long as the schema mapping information can be identified.
- For example, the namespace name itself may be used.
- Next, the search index schema control subprogram 1171 determines whether or not the designated namespace alias is registered in the search server 1100 (step S302).
- If the namespace alias is not registered (No in step S302), an error occurs and the process ends. If it is registered (Yes in step S302), the search index schema control subprogram 1171 uses the received namespace alias to obtain schema mapping information from the metadata schema management table 7100 and the schema mapping management table 7200 (step S303).
- Specifically, the metadata schema definition is obtained by selecting, from the records registered in the metadata schema management table 7100, the records whose namespace alias 7120 matches the designated alias.
- the schema mapping information is acquired by selecting a record in which the namespace name 7220 is the same as the namespace name included in the metadata schema definition from among the records registered in the schema mapping management table 7200.
- the search index schema control subprogram 1171 generates search index schema definition candidates from the acquired schema mapping information (step S304).
- Specifically, the namespace alias 7120 acquired from the metadata schema management table 7100 and the field name 7240 and field type 7250 acquired from the schema mapping management table 7200 are combined to form the search index schema definition candidates.
- the search index schema control subprogram 1171 transmits the generated search index schema definition candidate to the request source (for example, the management machine 3100) and outputs it to the output device 3172 (step S305).
- Specifically, the information to be presented is transmitted to the search server management client program 3124 on the management machine 3100, and the search server management client program 3124 outputs it to the output device 3172, such as a management screen or management console, so that the system administrator can view it.
- The search index schema control subprogram 1171 determines whether or not the presentation information needs to be changed as a result of the request source checking the presented information (step S306). Specifically, the system administrator or the like selects, from among the presented field names, those to be actually registered in the search index schema management table 7300 and those not to be registered, and whether a change is necessary is determined based on that selection. Here, information indicating whether or not a change is necessary may be included in the information returned from the search server management client program 3124 after the presentation information is transmitted. The above determination is performed using this information.
- If a change is necessary (Yes in step S306), the search index schema control subprogram 1171 acquires change information for the search index schema definition candidate from the request source and reflects the change information in the search index schema definition candidate (step S307). Specifically, the fields specified as needing registration in the search index schema management table 7300 are kept and the fields specified as not needing registration are deleted. Thereafter, the processing from step S305 onward is repeated.
- If no change is necessary (No in step S306), the search index schema control subprogram 1171 registers the search index schema definition candidate in the search index schema management table 7300 and ends the process.
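- The search index schema definition registration of FIG. 17 can be sketched as follows. The record layout, the boolean `update` key standing in for the administrator's per-field selection, and the decision to skip mappings marked "<NONE>" are assumptions made for this sketch.

```python
def build_index_schema_candidates(metadata_table, mapping_table, namespace_alias):
    """Steps S302 to S304: join the two tables on the namespace name."""
    namespace_names = {r["namespace_name"] for r in metadata_table
                       if r["namespace_alias"] == namespace_alias}
    if not namespace_names:
        raise LookupError(f"namespace alias not registered: {namespace_alias}")
    return [
        {"mapping_id": m["mapping_id"], "namespace_alias": namespace_alias,
         "field_name": m["field_name"], "field_type": m["field_type"],
         "update": True}  # default selection; the administrator may turn it off
        for m in mapping_table
        if m["namespace_name"] in namespace_names and m["field_name"] != "<NONE>"
    ]


def register_index_schema(index_schema, candidates):
    """Keep only the fields selected for registration (steps S306 and S307)."""
    for candidate in candidates:
        if candidate["update"]:
            index_schema[candidate["field_name"]] = candidate["field_type"]
    return index_schema


# Hypothetical table contents for illustration.
metadata_table = [
    {"namespace_alias": "A2", "namespace_name": "http://example.com/productA",
     "metadata_name": "author", "metadata_type": "xs:string"},
]
mapping_table = [
    {"mapping_id": 1, "namespace_name": "http://example.com/productA",
     "field_name": "CUSTOMMETA_A2_author", "field_type": "string"},
    {"mapping_id": 2, "namespace_name": "http://example.com/productA",
     "field_name": "CUSTOMMETA_A2_revision", "field_type": "integer"},
    {"mapping_id": 3, "namespace_name": "http://example.com/productA",
     "field_name": "<NONE>", "field_type": "<NONE>"},
]
```

- Excluding a field at this stage keeps it out of the index schema altogether, so no index data is later created for it.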
- FIG. 18 shows an example of an operation screen when the search index schema definition registration process described above is performed from the management screen of the management machine 3100 or the like.
- the search index schema definition registration screen 8300 provides a schema mapping management table calling column 8310 and a search index schema management table registration column 8320.
- The schema mapping management table call column 8310 is provided with a namespace alias input column 8311. After the namespace alias is input, pressing the Call button 8312 searches the metadata schema management table 7100 and the schema mapping management table 7200 for records having the specified namespace alias and outputs them to the search index schema management table registration field 8320 described later.
- After the Call button 8312 is pressed, the search index schema management table registration field 8320 outputs the search index schema definition candidates generated from the contents of the records of the metadata schema management table 7100 having the specified namespace alias and the records of the schema mapping management table 7200 having the same namespace name 7130 as those records.
- The display contents include an entry of the metadata schema management table 7100 (namespace alias 8323), entries of the schema mapping management table 7200 (mapping ID 8321, field name 8324, field type 8325), and an entry (field update flag 8322) indicating whether or not the field is to be added to the search index schema management table 7300.
- To correct the displayed contents, the target record is selected in the check box field 8326 and the Edit button 8327 is pressed.
- Pressing the Delete button 8328 deletes the contents.
- Pressing the Register button 8329 registers the contents in the search index schema management table 7300.
- Pressing the Cancel button 8330 cancels the registration process for the search index schema management table 7300.
- By providing the field update flag 8322 on the search index schema definition registration screen 8300, it is possible to choose not to create a search index schema definition for a field that is not a search target. That is, the search control program 1124 generates a search index schema definition if the field update flag 8322 is "Yes", and does not generate one if the field update flag 8322 is "No". Excluding fields that are not search targets in this way prevents the search index schema management table 7300 from growing unnecessarily.
- FIG. 19 is a flowchart showing the flow of search index update processing performed by the indexing control subprogram 1173 of the search server 1100.
- In this process, triggered by a search index update start request preset in the search server 1100, files on the file server 2100 are accessed using the file access control subprogram 1172.
- the processor 1110 that executes the indexing control subprogram 1173 identifies the search index update target file from the file of the file server 2100, and acquires data and metadata related to the search index update target file from the file server 2100.
- The processor 1110 that executes the indexing control subprogram 1173 updates the search index based on the acquired data, and reflects the update result in the search index management table 7400 and the search index registration file management table 7500. Note that, as before, the description of the processor 1110 that executes the program is omitted.
- the indexing control subprogram 1173 determines whether or not the search index update process is performed by differential indexing (step S401).
- When the search index update request is issued, information indicating whether to perform differential indexing or full indexing may also be specified, and the determination may be made based on that information.
- When full indexing is performed (No in step S401), the indexing control subprogram 1173 determines whether or not all search target files stored in the file server 2100 have been crawled (step S402).
- If not all files have been crawled (No in step S402), the indexing control subprogram 1173 selects one file that has not yet been selected in the current crawl from among the files stored in the file server 2100 (step S403).
- the indexing control subprogram 1173 adds the target file name to the search index update target list (step S404).
- Specifically, the file name of the file selected in step S403 is added to the list.
- The process then returns to step S402 and is repeated until the file names of all target files have been added to the search index update target list.
- If the target file list can be acquired separately, it may be used as it is.
- When all files have been crawled (Yes in step S402), the indexing control subprogram 1173 updates the search index for the files described in the search index update target list (step S405).
- the indexing control subprogram 1173 acquires the data and metadata of the target file from the file server 2100, and specifies the type of the file described in the search index update target list.
- the indexing control subprogram 1173 acquires the metadata schema definition to be indexed by the specified type and the information to be indexed from the metadata schema management table 7100, the schema mapping management table 7200, and the search index schema management table 7300.
- the indexing control subprogram 1173 extracts contents to be indexed from the target file based on the acquired information, and reflects the extracted contents in the search index management table 7400 and the search index registration file management table 7500. This completes the full indexing process.
- When differential indexing is performed (Yes in step S401), the processing described below is performed.
- the indexing control subprogram 1173 determines whether or not all search target files stored in the file server 2100 have been crawled (step S406).
- If not all files have been crawled (No in step S406), the indexing control subprogram 1173 selects one file that has not yet been selected in the current crawl from among the files stored in the file server 2100 (step S407).
- the indexing control subprogram 1173 refers to the last update date / time information of the selected target file, and determines whether it is newer than the previous index update date / time (step S408).
- When the last update date and time of the file is newer than the previous index update date and time (Yes in step S408), the indexing control subprogram 1173 determines that the target file is an update target of the search index and adds the file name of the file to the search index update target list (step S409). Thereafter, the process returns to step S406, and the above processing is repeated for all target files.
- When the last update date and time of the file is not newer than the previous index update date and time (No in step S408), the indexing control subprogram 1173 determines that the search index of the target file does not need to be updated, and returns to step S406. Thereafter, the above process is repeated for all target files in the same manner.
- When all files have been crawled (Yes in step S406), the indexing control subprogram 1173 updates the search index for the files listed in the search index update target list (step S405).
- the differential indexing process ends.
- the update of the search index is completed by the full indexing process or the differential indexing process.
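- The way the search index update target list is built in FIG. 19 can be sketched as follows. The function name `select_update_targets`, the representation of a file as a (path, last update time) pair, and the choice to treat a missing previous index time as "index everything" are assumptions for illustration.

```python
from datetime import datetime


def select_update_targets(files, differential, last_indexed=None):
    """Build the search index update target list.

    files:        iterable of (file path, last update date and time)
    differential: True for differential indexing, False for full indexing
    last_indexed: date and time of the previous index update, if any
    """
    targets = []
    for path, modified in files:  # one pass over the crawl (S402/S403 or S406/S407)
        if not differential:
            targets.append(path)  # full indexing: every file is a target (S404)
        elif last_indexed is None or modified > last_indexed:
            targets.append(path)  # changed since the last index update (S408/S409)
    return targets


# Hypothetical crawl result.
files = [
    ("/share/a.txt", datetime(2012, 7, 1, 9, 0)),
    ("/share/b.txt", datetime(2012, 7, 10, 9, 0)),
    ("/share/c.txt", datetime(2012, 7, 12, 9, 0)),
]
previous_update = datetime(2012, 7, 5, 0, 0)
```

- With this data, full indexing selects all three files, while differential indexing selects only the two files modified after the previous index update. Both branches then share the same update step (S405) over the resulting list.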
- FIG. 20 is a flowchart showing a flow of a file search process using a search index in the search server 1100.
- a file search process performed by the processor 1110 that executes the search response control subprogram 1174 on the search server 1100 will be described as an example.
- a search process when a search request is transmitted from the search client control program 4124 on the client machine 4100 to the search server 1100 will be described. This process is started when the search control program 1124 receives a search request.
- Note that, as before, the description of the processor 1110 is omitted to simplify the description.
- the search response control subprogram 1174 analyzes the content of the search request received from the search request source (client machine 4100), and acquires the search conditions specified in the search request (step S501).
- the search response control subprogram 1174 determines whether or not the search field name needs to be converted for the search condition specified in the search request (step S502).
- the search condition specified in the search condition
- the field alias 7260 and the namespace name 7220 of the schema mapping management table 7200 are designated as search conditions, or the field alias 7260 of the schema mapping management table 7200 and the namespace alias 7120 of the metadata schema management table 7100 are designated. If it is, you cannot search directly, so convert it to a searchable field name. Therefore, if the field name 7310 of the search index schema management table 7300 is specified as the search condition, it is determined that conversion is not necessary, and otherwise conversion is determined.
- The search response control subprogram 1174 performs field name conversion based on the registration information in the metadata schema management table 7100 and the schema mapping management table 7200 (step S503).
- Specifically, when the field alias 7260 and the namespace name 7220 of the schema mapping management table 7200 are specified as the search condition, they are converted to the field name 7240 of the matching record.
- When the field alias 7260 of the schema mapping management table 7200 and the namespace alias 7120 of the metadata schema management table 7100 are specified as the search condition, the namespace name 7130 for the namespace alias 7120 is acquired first, and the condition is then converted, in the same way as above, to the field name 7240 of the record whose namespace name 7220 matches. Then, the process proceeds to step S504.
- The search response control subprogram 1174 refers to the search index management table 7400, identifies the records that match the search condition, and acquires the file identification information 7431, 7434, etc. stored in those records (step S504).
- the search response control subprogram 1174 refers to the search index registration file management table 7500, and acquires the file path name 7520 of the target file from the file identification information 7431 and 7434 acquired in the previous process (step S505).
- The search response control subprogram 1174 summarizes the search results based on the acquired information, responds to the request source, and ends the process (step S506).
- As described above, when the search server 1100 receives a search request from the client machine 4100, it returns the file paths that match the search conditions. If the search request requires conversion of field names, the search server 1100 converts the field names using the metadata schema management table 7100 and the schema mapping management table 7200, so that the file paths matching the search conditions can still be found.
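- The flow of steps S502 to S505 can be sketched as follows. This is only a minimal illustration, not the implementation of the search server 1100: the in-memory dictionaries stand in for the management tables 7100, 7200, 7300, 7400 and 7500, and the namespace name, the searchable field name `patient_id`, the file identifiers and the file paths are all hypothetical sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRecord:
    """One record of the schema mapping management table 7200 (simplified)."""
    namespace_name: str  # namespace name 7220
    field_name: str      # field name 7240 (searchable name)
    field_alias: str     # field alias 7260


# All sample values below are hypothetical.
NAMESPACE_ALIASES = {"A2": "urn:example:patient"}   # alias 7120 -> name 7130
SCHEMA_MAPPING = [MappingRecord("urn:example:patient", "patient_id", "PatientID")]
INDEX_FIELD_NAMES = {"patient_id"}                   # field names 7310
SEARCH_INDEX = {("patient_id", "1000"): ["f1", "f4"]}             # table 7400
REGISTERED_FILES = {"f1": "/share/a.xml", "f4": "/share/d.xml"}   # table 7500


def resolve_field_name(name, namespace=None):
    """Steps S502/S503: return a searchable field name for the condition."""
    # S502: a field name of the search index schema needs no conversion.
    if namespace is None and name in INDEX_FIELD_NAMES:
        return name
    # S503: a namespace alias is first replaced by its namespace name.
    namespace_name = NAMESPACE_ALIASES.get(namespace, namespace)
    for record in SCHEMA_MAPPING:
        if record.field_alias == name and record.namespace_name == namespace_name:
            return record.field_name
    raise KeyError(f"no searchable field for {namespace!r}:{name!r}")


def search(name, value, namespace=None):
    """Steps S504/S505: look up file identifiers, then their path names."""
    field_name = resolve_field_name(name, namespace)
    file_ids = SEARCH_INDEX.get((field_name, value), [])
    return [REGISTERED_FILES[file_id] for file_id in file_ids]


paths = search("PatientID", "1000", namespace="A2")
```

- In this sketch, a condition given as a namespace alias plus field alias, as a namespace name plus field alias, or directly as the searchable field name all resolve to the same index lookup.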
- FIG. 21 shows an example of an operation screen when the file search process described above is performed from the screen of the output device 4172 of the client machine 4100 used by the search user.
- the file search screen 8400 provides a search condition input field 8410 and a search result output field 8420.
- the search condition input field 8410 provides an input field for inputting various search conditions.
- any format may be used as long as a combination of a plurality of conditions by a logical expression can be input as a search condition.
- the search condition described here is a combination of a search target field name and a value, or a range of values, stored in the field specified by that field name.
- a namespace name of the field name or an alias of the namespace name may be added.
- a field alias may be specified instead of the field name.
- Here, a search condition is shown that searches for files in which the namespace alias is “A2” and the value of the field specified by the field name “PatientID” is “1000”.
- When the search button 8411 is pressed after the above search condition is input, the specified search condition is transmitted from the client machine 4100 to the search server 1100, and the search is executed.
- After the search button 8411 is pressed, the search result output field 8420 shows the result of the search that the search server 1100 performed under the specified search condition.
- Here, a list output 8430 is displayed, in which the files matching the search condition are output in a list format as the search result.
- The search result is output in the list output format when the List button 8421 is pressed, and in the table output format described later when the Table button 8422 is pressed. Note that an output format other than the list output format and the table output format may also be made selectable.
- Information on the field names and metadata values of each file's metadata may also be output. Which fields are output, and in what format, may be made configurable.
- A search result alignment condition input field 8423 is provided to specify the field by which the results are sorted, together with controls to specify ascending or descending order.
- Specifically, an ASC button 8424 and a DESC button 8425 are provided. When a field name is input in the alignment condition input field 8423 and the ASC button 8424 is pressed, the results are output sorted in ascending order of that field's value. When a field name is input in the alignment condition input field 8423 and the DESC button 8425 is pressed, the results are output sorted in descending order of that field's value.
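- The behavior of the ASC button 8424 and the DESC button 8425 can be sketched as a sort over the search hits. The record layout (a `path` plus a `metadata` dictionary) and the sample values are assumptions made for illustration; the source does not define how hits are represented, and the sketch assumes every hit carries the field being sorted on.

```python
def sort_results(results, field_name, descending=False):
    """Sort search hits by the value of one metadata field."""
    return sorted(
        results,
        key=lambda hit: hit["metadata"][field_name],
        reverse=descending,
    )


# Hypothetical search hits.
hits = [
    {"path": "/share/b.xml", "metadata": {"PatientID": "1002"}},
    {"path": "/share/a.xml", "metadata": {"PatientID": "1000"}},
    {"path": "/share/c.xml", "metadata": {"PatientID": "1001"}},
]

ascending = [hit["path"] for hit in sort_results(hits, "PatientID")]
descending = [hit["path"] for hit in sort_results(hits, "PatientID", descending=True)]
```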
- FIG. 22 shows an example in which the search result is in a table output format as an example of an operation screen when the above-described file search process is performed from the screen for the search user of the client machine 4100.
- the screen configuration is almost the same as in FIG. Hereinafter, only differences from FIG. 21 will be described.
- In the search result output field 8420, a table output 8440 is displayed, in which the files matching the search condition are output in a table format as the search result.
- When outputting in the table output format, information on the field names and metadata values of each file's metadata is output as the columns of the table. Which fields are output, and in what format, may be made configurable.
- the search server 1100 can easily identify a file having specific metadata even if the data format of the search target data is different. Accordingly, a file having a plurality of types of metadata can be stored in the file server 2100 and the search server 1100 can provide a search service.
- When the search server 1100 first updates the search index management table 7400 after updating a field of the search index schema management table 7300, the search server 1100 can take the file data corresponding to the updated field into the search index management table 7400. That is, after a desired search index schema definition is added to the search index schema management table 7300, the search server 1100 specifies files having a metadata name corresponding to the added field name as update targets. The search server 1100 can then update the search index management table 7400 and the search index registration file management table 7500 by indexing the files specified as update targets.
- In other words, when the search index is first updated after a field of the search index schema management table 7300 of the search server 1100 is updated, the data related to the updated field can be taken into the search index management table 7400 and becomes searchable.
- When the search index is updated by full indexing after the field is updated, the information on the updated field can be acquired for all search target files without omission.
- On the other hand, when the search index is updated by differential indexing, if a file having the updated field has not been updated on the file server 2100, the search server 1100 cannot identify that file as an update target file of the search index, and the data related to the updated field cannot be taken into the search index management table 7400. Specifically, this is the case where the file was last updated before the previous search index update date and time. For this reason, even if a field is added to the search index schema management table 7300, the information on that field of a file may not be reflected in the search index management table 7400, depending on the last update date and time of the search target file.
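- The omission described above can be illustrated with a small sketch of differential indexing that selects update targets only by the last update date and time of each file. The file paths and dates are hypothetical, and the function name is not taken from the source.

```python
from datetime import datetime


def select_by_last_update(files, previous_index_update):
    """Pick files modified after the previous search index update."""
    return sorted(
        path for path, modified in files.items()
        if modified > previous_index_update
    )


# Hypothetical files: both are assumed to carry the newly added field.
files = {
    "/share/old.xml": datetime(2012, 7, 1),   # not modified since last update
    "/share/new.xml": datetime(2012, 7, 10),  # modified after last update
}
previous_index_update = datetime(2012, 7, 5)

targets = select_by_last_update(files, previous_index_update)
```

- Here `/share/old.xml` is never selected, so a field added to the schema after the previous update is not indexed for that file until the file itself changes.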
- Therefore, when the search server 1100 updates the search index for the files stored in the file server 2100, it determines whether there is a field update in the search index schema management table 7300 in the local search server 1100. When there is a field update, a file search is performed with the field name as a keyword, using the search index management table 7400 managed by the local search server 1100. Differential indexing control in which the files hit by this search can be handled as search index update target files will be described as a second embodiment.
- The second embodiment differs from the first embodiment in a part of the search index schema management table 7300, the search index schema definition registration process, and the search index update process. The rest is the same as in the first embodiment.
- FIG. 23 is an explanatory diagram illustrating a flow of a series of processes performed in the computer system according to the second embodiment.
- the schema mapping registration process (process (6-n)) from the management machine 3100 to the search server 1100 and the search index update process (process (7-n)) from the search server 1100 to the file on the file server 2100 are performed.
- Each of these processes will be described. Note that the processing (6-n) and the processing (7-n) correspond to the processing (3-n) and the processing (4-n) in FIG. Processing (1-n), processing (2-n) and processing (5-n) in FIG. 6 are the same processing flow in the second embodiment. For this reason, redundant description is omitted.
- the system administrator uses the search server management client control program 3124 of the management machine 3100 to call the search index schema registration screen 8300 (see FIG. 18) with respect to the search server 1100 (process (6-1)).
- the system administrator designates the namespace alias used in the process (2-n) of the first embodiment for use in registration of the search index schema.
- the search server 1100 acquires the schema mapping information corresponding to the specified namespace alias from the schema mapping management table 7200, generates a search index schema definition candidate based on that information, and presents the contents of the candidate to the request source (management machine 3100) (process (6-2)).
- the search index schema definition generated here includes the field name in the schema mapping definition and the data type of each field.
- the system administrator confirms the presented content and sends update information for the candidate content to the search server 1100 as necessary (processing (6-3)).
- The search server 1100 generates a final search index schema definition based on the search index schema definition candidates and the update information from the request source, and stores it in the search index schema management table 7300 (process (6-4)).
- In process (6-4), information on the last update date and time of the record is newly stored for each record in the search index schema management table 7300.
- As a result, the search index schema definition in the search server 1100 can be updated, new searchable metadata can be added, and it can be determined when the field corresponding to that metadata was added.
- the above is a series of flows of the search index schema definition registration process.
- Next, the search index update process (7-n) will be described.
- The search control program 1124 on the search server 1100 performs file access to the file server 2100, identifies the files for which the search index needs to be updated, and reads out those files (process (7-1)).
- The file server 2100 uses the file system 2170 managed by the file server 2100 as necessary, acquires the information of the target file, and transmits it to the request source (management machine 3100) (process (7-2)).
- The search control program 1124 that has acquired the update target file specifies the file type of the file. Then, after obtaining the schema definition information matching the file type from the search index schema management table 7300, it analyzes the file and extracts the information necessary for generating search index data (process (7-3)).
- The search control program 1124 generates search index data based on the extracted information and reflects it in the search index management table 7400 and the search index registration file management table 7500 (process (7-4)). Thereafter, the search control program 1124 refers to the schema definition last update date and time newly registered in each record of the search index schema management table 7300, and acquires a list of the field names of the records whose update date and time is newer than the previous search index update date and time (process (7-5)).
- The search control program 1124 specifies each acquired field name as a search keyword, and searches for files including the search keyword (field name) using the search index management table 7400 and the search index registration file management table 7500 (process (7-6)).
- the search control program 1124 on the search server 1100 accesses the file server 2100 based on the file list acquired as a search result, and reads the target file (process (7-7)).
- the file server 2100 uses the file system 2170 managed by the file server 2100 as necessary to acquire information on the target file and provide it to the request source (process (7-8)).
- the search control program 1124 that has acquired the additional update target file specifies the file type of the file, acquires the schema definition information that matches the file type from the search index schema management table 7300, and then analyzes the file and extracts the information necessary for generating search index data (process (7-9)).
- the search control program 1124 generates search index data based on the generated information and reflects it in the search index management table 7400 and the search index registration file management table 7500 (processing (7-10)).
- the above is a series of flow of search index update processing by the search server 1100.
- In this way, the updated field is identified, the files having that field name are acquired using the keyword search function, and the search index is updated by differential indexing.
- the configuration of the search index schema management table 7300, the search index schema definition registration process, and a part of the search index update process are changed. These changes will be described with reference to FIGS. 24, 25, and 26.
- FIG. 24 shows the change contents of the search index schema management table 7300 described in FIG. 10 of the first embodiment.
- the configuration of this management table is different from that of the first embodiment in that the schema definition last update date and time 7330 is newly stored in each record as compared to the configuration described in FIG.
- This schema definition last update date and time 7330 stores date and time information when the record is added to the search index schema management table 7300 and when the setting contents of the record are updated.
- the configuration is the same as that of FIG.
- FIG. 25 shows the change contents of the search index schema definition registration process described in FIG. 17 of the first embodiment.
- This flowchart differs in that, when search index schema definition candidates are registered in the search index schema management table 7300, the schema definition last update date and time 7330 shown in FIG. 24 is additionally registered as new information. The details are described below.
- step S309 is newly added after step S308.
- the search index schema control subprogram 1171 registers the current date and time information in the schema definition last update date and time 7330 for the newly added records and updated records of the search index schema management table 7300 (step S309). As a result, it becomes possible to know when each record was updated. This information is used in the search index update process described later.
- FIGS. 26A and 26B are flowcharts in which a part of the search index update process described in FIG. 19 of the first embodiment is changed. Compared with the search index update process described with reference to FIG. 19, this flowchart differs in that it refers to the schema definition last update date and time 7330 newly registered in the search index schema management table 7300, acquires the field names 7310 stored in the records whose update date and time is newer than the previous search index update date and time, performs a file search using each field name 7310 as a keyword, and also handles the hit files as search index update target files. The details are described below.
- step S401 is the addition of step S410 prior to step S406 of FIG. 19, and the step of FIG. 26B.
- the process in the case of Yes in S406 is that step S411 is newly added prior to step S405 in FIG. The rest is the same as FIG. Only the changed part will be described below.
- the indexing control subprogram 1173 acquires the field names 7310 of all records in which the schema definition last update date and time 7330 of the search index schema management table 7300 is newer than the previous search index update date and time (step S410).
- the field name 7310 acquired here is used for specifying the search index update target file in later processing.
- the metadata name 7230 associated with the field name is also acquired from the schema mapping management table 7200. In the subsequent processing, when a search is performed using the field name 7310, the search is also performed using the metadata name 7230.
- The indexing control subprogram 1173 uses the search index management table 7400 to perform a full-text search for files containing the same character string as the field name 7310 acquired in step S410 or the metadata name 7230 associated with that field name.
- From the hits of that search, the indexing control subprogram 1173 adds to the search index update target list any file name that is not yet registered in the list (step S411).
- the file name of the hit file as a search result is acquired from the search index registration file management table 7500.
- By this process, files that were difficult to extract as search index update target files with the differential indexing based on the last update date and time of the file shown in the first embodiment can be listed, and the search index can be updated for them.
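- Steps S410 and S411 can be sketched as follows, assuming simplified in-memory stand-ins for the search index schema management table 7300 (with the last update date and time 7330), the schema mapping management table 7200, and the text indexed in the search index management table 7400. All field names, file paths, dates and indexed strings are hypothetical.

```python
from datetime import datetime

# Hypothetical schema records: field name 7310 and last update 7330.
SCHEMA_TABLE = [
    {"field_name": "patient_id", "last_update": datetime(2012, 7, 8)},
    {"field_name": "title", "last_update": datetime(2012, 6, 1)},
]
# Hypothetical mapping: field name 7310 -> metadata name 7230.
SCHEMA_MAPPING = {"patient_id": "PatientID"}
# Hypothetical indexed text per file (stand-in for table 7400).
INDEXED_TEXT = {
    "/share/old.xml": "PatientID 1000 title chest",
    "/share/new.xml": "PatientID 2000 title head",
    "/share/memo.xml": "title note",
}


def updated_keywords(schema_table, schema_mapping, previous_index_update):
    """S410: field names (and mapped metadata names) of updated records."""
    keywords = []
    for record in schema_table:
        if record["last_update"] > previous_index_update:
            keywords.append(record["field_name"])
            metadata_name = schema_mapping.get(record["field_name"])
            if metadata_name is not None:
                keywords.append(metadata_name)
    return keywords


def add_keyword_hits(update_targets, keywords, indexed_text):
    """S411: add full-text hits that are not yet in the target list."""
    for path, text in indexed_text.items():
        if path not in update_targets and any(k in text for k in keywords):
            update_targets.append(path)
    return update_targets


previous_index_update = datetime(2012, 7, 5)
targets = ["/share/new.xml"]  # already selected by last update date and time
keywords = updated_keywords(SCHEMA_TABLE, SCHEMA_MAPPING, previous_index_update)
targets = add_keyword_hits(targets, keywords, INDEXED_TEXT)
```

- In this sketch, `/share/old.xml` is picked up through the keyword search even though it was not modified after the previous search index update.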
- In this way, the full-text search function with the field name specified as a keyword is used to prevent search index update target files from being missed.
- However, with a simple full-text search function, a file that stores the same character string as the field name as the value of some arbitrary field is also listed as a target file. This avoids omissions in acquiring the update target files of the search index, but causes the problem that files that are not update targets are also acquired as update targets.
- Therefore, the search server 1100 also indexes all the metadata names associated with each search target file.
- When the search server 1100 updates the search index for the files stored in the file server 2100, it determines whether there is a field update in the search index schema management table 7300 in the local search server 1100. If a field has been updated, the field name is designated as a search condition and a metadata search is performed using the search index management table 7400 managed by the local search server 1100.
- Differential indexing control in which the files hit by this metadata search can be handled as search index update target files will be described as a third embodiment.
- After updating a field of the search index schema management table 7300, the search server 1100 identifies the updated field, extracts the files having that field name using the metadata search function, and performs differential indexing.
- a part of the field definition of the search index schema management table 7300, the search index schema definition update process, and the search index update process is changed. These changes will be described with reference to FIGS. 27, 28A, 28B, 29A, and 29B.
- FIG. 27 shows the search index schema management table 7300 described in FIG. 24 of the second embodiment with a part changed.
- the configuration of this management table is different from the configuration described with reference to FIG. 24 in that a record having a field name 7310 of “MetadataName” is added.
- This record means that a character string of the metadata name associated with the search target file is extracted and indexed.
- the character string set as the field name 7310 of the newly added record here may be an arbitrary character string.
- the field type 7320 of the record to be added here is an array of character strings or a list of character strings. This is because, when a plurality of metadata are associated with a file, all metadata names are indexed. Other than this, the configuration is the same as that of FIG.
- FIGS. 28A and 28B are flowcharts in which a part of the search index schema definition registration process described in FIG. 25 of the second embodiment is changed. This flowchart differs from the search index schema definition registration process described in FIG. 25 in that a field for indexing metadata names is added to the search index schema management table 7300. The details are described below.
- step S310 is newly added after step S308 in FIG. 28B.
- the search index schema control subprogram 1171 registers a field for indexing the metadata name in the search index schema management table 7300 (step S310).
- the search server 1100 registers a record having a field name assigned for metadata name indexing in the search index schema management table 7300.
- As a result, the set of metadata names of each search target file can be indexed and made searchable the next time the search index is updated. This information is used in the search index update process described later.
- If a metadata name indexing record has already been registered in the search index schema management table 7300, the process proceeds to the next step without doing anything.
- FIGS. 29A and 29B are flowcharts in which a part of the search index update process described in FIGS. 26A and 26B of the second embodiment is changed. Compared with the search index update process described in FIG. 26A and FIG. 26B, this flowchart differs in that it performs a metadata search using the metadata name indexing field of the search index schema management table 7300 and handles the hit files as search index update target files. The details are described below.
- step S411 of FIG. 26B is changed to step S412.
- step S405 in FIG. 26B is changed to step S413.
- the rest is the same as FIG. 26A and FIG. 26B.
- Instead of step S411, the indexing control subprogram 1173 uses the metadata name indexing field in the search index management table 7400 to search for files that include, as a metadata name, the same character string as the field name or the metadata name associated with it, and adds to the search index update target list any hit file name that is not yet registered in the list (step S412).
- The file name of each file hit by the search is acquired from the search index registration file management table 7500. Compared with the full-text search used in the second embodiment, this process reduces the possibility that files whose search index does not need to be updated are added to the search index update target list.
- After step S412, the indexing control subprogram 1173, instead of step S405, updates the search index for the files listed in the search index update target list (step S413).
- In step S413, all metadata names of the target file are also extracted and indexed.
- As a result, in step S412 described above, the files to be placed in the search index update target list can be specified by searching with the field name 7310 and the associated metadata name 7230.
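- The difference between the metadata search of step S412 and the full-text search of the second embodiment can be sketched as follows. The list of metadata names stored per file corresponds to the “MetadataName” field described above; the file contents, paths and helper names are hypothetical, and the full-text search is reduced to a plain substring match for illustration.

```python
# Hypothetical files: metadata name -> metadata value.
FILES = {
    "/share/a.xml": {"PatientID": "1000"},
    "/share/b.xml": {"Comment": "PatientID unknown"},
}


def build_metadata_name_field(files):
    """S413: index all metadata names of each file as a list of strings."""
    return {path: sorted(metadata) for path, metadata in files.items()}


def metadata_name_search(name_field, names):
    """S412: hit only files that have one of the names as a metadata name."""
    wanted = set(names)
    return [path for path, stored in name_field.items() if wanted & set(stored)]


def full_text_search(files, names):
    """Second-embodiment style search: names and values are both matched."""
    hits = []
    for path, metadata in files.items():
        text = " ".join(list(metadata) + list(metadata.values()))
        if any(name in text for name in names):
            hits.append(path)
    return hits


names = ["patient_id", "PatientID"]  # field name and associated metadata name
name_field = build_metadata_name_field(FILES)
by_name = metadata_name_search(name_field, names)
by_text = full_text_search(FILES, names)
```

- In this sketch, `/share/b.xml` mentions the string only inside a field value, so it is hit by the full-text search but not by the metadata name search.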
- the field name is indexed and the metadata search function using the field name is used to prevent the search index update target file from being overlooked.
- However, the amount of data in the search index management table 7400 may increase, and the processing performance of the search function that the search service is originally meant to provide may deteriorate.
- Therefore, an example of control in which a metadata name management table 7600, which indicates in which file each metadata name 7230 associated with a field name 7310 of the search index schema management table 7300 exists, is newly introduced into the search server 1100, and the search index is updated by differential indexing using the metadata name management table 7600, will be described as a fourth embodiment.
- FIG. 30 is a block diagram in which a part of the configuration of the search server 1100 described in FIG. 2 of the first embodiment is changed.
- This configuration is different from the configuration described with reference to FIG. 2 in that a metadata name management table 7600 is newly added on the memory 1120.
- the metadata name management table 7600 will be described later.
- the configuration is the same as that of FIG.
- FIG. 31 shows the structure of the metadata name management table 7600.
- The metadata name management table 7600 manages the information needed to extract the metadata names included in the search target files and, based on the extraction result, to identify the files that include a given metadata name.
- the metadata name management table 7600 includes information of a metadata ID 7610, a metadata name 7620, and a file list 7630.
- the metadata ID 7610 is for uniquely identifying the metadata name. This is an identification number mechanically assigned to each record of the metadata name management table 7600.
- the metadata name 7620 stores a character string of the metadata name included in the target file.
- the file list 7630 stores a list of information for identifying the files having the metadata name 7620 of the record. For example, the file path name of the target file may be stored, the URL of the target file may be stored, or the same information as the file identification information 7510 of the search index registration file management table 7500 may be stored.
- FIGS. 32A and 32B are flowcharts in which a part of the search index update process described with reference to FIGS. 26A and 26B is changed.
- This flowchart differs in that the metadata names extracted from the target file are registered in the metadata name management table 7600, and the search index update target files are specified using the metadata name management table 7600. The details are described below.
- The changes from FIG. 26A and FIG. 26B are that a new process of step S414 is added after step S404 shown in FIG. 32A, after step S409 shown in FIG. 32B, and in the case of No in step S408, and that step S411 of FIG. 26B is changed to step S415. The rest is the same as FIG. 26A and FIG. 26B. Only the changed parts are described below.
- After step S404, after step S409, or in the case of No in step S408, the indexing control subprogram 1173 extracts the metadata names from the target file and stores the extracted information in the metadata name management table 7600 (steps S414 and S414A). This process may be performed as part of the indexing process for each target file.
- Instead of step S411, the indexing control subprogram 1173 uses the metadata name management table 7600 to search for files that include the same character string as the field name or the metadata name associated with it, and adds to the search index update target list any hit file name that is not yet registered in the list (step S415). Here, the file name of each file hit by the search is converted as necessary. For example, when the same information as the file identification information 7510 in the search index registration file management table 7500 is stored in the file list 7630 of the metadata name management table 7600, the indexing control subprogram 1173 uses the search index registration file management table 7500 to separately acquire the file path name associated with the file identification information 7510. By this process, the search index update target files can be specified with the same accuracy as in the third embodiment, without indexing the metadata names in the search index management table 7400 as in the third embodiment.
- As a result, the search index update target files can be narrowed down with the same degree of accuracy as in the third embodiment, without increasing the amount of data stored in the search index management table 7400 as the third embodiment does.
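- A minimal sketch of the metadata name management table 7600 and of steps S414 and S415 follows. The class and method names are hypothetical, the file list 7630 is assumed to hold file identification information, and a plain dictionary stands in for the search index registration file management table 7500.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class MetadataNameRecord:
    """One record of the metadata name management table 7600."""
    metadata_id: int                                # metadata ID 7610
    metadata_name: str                              # metadata name 7620
    file_list: list = field(default_factory=list)   # file list 7630


class MetadataNameTable:
    def __init__(self):
        self._records = {}
        self._ids = count(1)  # IDs are assigned mechanically per record

    def register(self, file_id, metadata_names):
        """S414: record which file contains which metadata names."""
        for name in metadata_names:
            record = self._records.get(name)
            if record is None:
                record = MetadataNameRecord(next(self._ids), name)
                self._records[name] = record
            if file_id not in record.file_list:
                record.file_list.append(file_id)

    def files_with(self, names):
        """S415: file identifiers that have any of the given names."""
        hits = []
        for name in names:
            record = self._records.get(name)
            for file_id in (record.file_list if record else []):
                if file_id not in hits:
                    hits.append(file_id)
        return hits


# Hypothetical stand-in for table 7500: identification info -> path name.
REGISTERED_FILES = {"f1": "/share/a.xml", "f2": "/share/b.xml", "f3": "/share/c.xml"}

table = MetadataNameTable()
table.register("f1", ["PatientID", "StudyDate"])
table.register("f2", ["Comment"])
table.register("f3", ["PatientID"])

targets = ["/share/c.xml"]  # already in the update target list
for file_id in table.files_with(["patient_id", "PatientID"]):
    path = REGISTERED_FILES[file_id]  # convert identification info to a path
    if path not in targets:
        targets.append(path)
```

- Keeping this lookup outside the search index means the index itself carries no extra metadata name field, which is the trade-off the fourth embodiment describes.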
- In the fourth embodiment, a metadata name management table 7600 indicating in which file each metadata name 7230 associated with a field name 7310 of the search index schema management table 7300 exists is newly introduced into the search server 1100, and a differential indexing process using the metadata name management table 7600 is shown.
- However, in a system in which the amount of data that can be stored in the storage of the search server 1100 is not so large, it may be difficult to store the data of the metadata name management table 7600 in the search server 1100.
- Therefore, control in which this metadata name management table 7600 is newly introduced into the file server 2100 in which the search target files are stored, and the search server 1100 updates the search index by differential indexing using the metadata name management table 7600, will be described as a fifth embodiment.
- FIG. 33 is a block diagram in which a part of the configuration of the file server 2100 described in FIG. 3 is changed.
- This configuration is different from the configuration described in FIG. 3 of the first embodiment in that a metadata name extraction control program 2125 and a metadata name management table 7600 are newly added on the memory 2120.
- the metadata name extraction control program 2125 will be described later.
- the metadata name management table 7600 is the same as that described in FIG. 31 of the fourth embodiment. Other than this, the configuration is the same as that of the first embodiment shown in FIG.
- FIG. 34 is a flowchart showing an example of file access processing in the file sharing service provided by the file sharing control program 2124 of the file server 2100. In this process, various file access processes requested from the client (search server 1100 or client machine 4100) of the file sharing service are performed.
- In addition, the file sharing control program 2124 cooperates with the metadata name extraction control program 2125 to perform a new process that analyzes the contents of the target file, extracts the metadata names, and reflects them in the metadata name management table 7600.
- the metadata name extraction control program 2125 may extract a character string corresponding to the metadata name from the target file using a known or well-known character string analysis program.
- the processor 2120 that executes the file sharing control program 2124 determines whether the file access process requested by the client is a new file creation process or an update process (step S601). Here, the determination is made based on the processing type information in the file access request.
- step S601 If it is a new file creation process or an update process (Yes in step S601), the file sharing control program 2124 extracts the metadata name from the target file, and the extracted metadata name is stored in the metadata name management table 7600. Reflect (step S602). Here, the metadata name extraction is performed in cooperation with the metadata name extraction control program 2125. Thereafter, the file sharing control program 2124 performs the designated file access process (step S603) and ends the process.
- the file sharing control program 2124 determines whether the file access process requested by the client is a file deletion process (step S604). If it is a file deletion process (Yes in step S604), the file sharing control program 2124 deletes information regarding the deletion target file from the metadata name management table 7600 (step S605). Thereafter, the file sharing control program 2124 deletes the target file as the designated file access process (step S606), and ends the process.
- step S604 If it is not a file deletion process (No in step S604), the file sharing control program 2124 performs a file access process designated by the client (step S607) and ends the process.
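- The branching of FIG. 34 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the file sharing control program 2124: the dictionaries standing in for the file system and for the metadata name management table 7600, the function names, and the toy extractor (which treats every `name: value` line as a metadata entry) are all hypothetical. Clearing a file's stale entries before re-registering it on update is likewise an assumption of this sketch.

```python
from typing import Dict, Optional, Set

files: Dict[str, str] = {}                      # path -> content (stand-in for the file system)
metadata_name_table: Dict[str, Set[str]] = {}   # metadata name -> paths (stand-in for table 7600)


def extract_metadata_names(content: str) -> Set[str]:
    """Toy extractor (assumption): each 'name: value' line yields one metadata name."""
    return {line.split(":", 1)[0].strip() for line in content.splitlines() if ":" in line}


def forget_file(path: str) -> None:
    """Remove every table entry that refers to the given path."""
    for name in list(metadata_name_table):
        metadata_name_table[name].discard(path)
        if not metadata_name_table[name]:
            del metadata_name_table[name]


def handle_file_access(op: str, path: str, content: Optional[str] = None) -> Optional[str]:
    if op in ("create", "update"):                 # step S601: Yes
        forget_file(path)                          # assumption: drop stale names first
        for name in extract_metadata_names(content or ""):       # step S602
            metadata_name_table.setdefault(name, set()).add(path)
        files[path] = content or ""                # step S603
        return None
    if op == "delete":                             # step S604: Yes
        forget_file(path)                          # step S605
        files.pop(path, None)                      # step S606
        return None
    return files.get(path)                         # step S607 (for example, a read)
```

Calling `handle_file_access("create", "/a.txt", "author: x")` registers `/a.txt` under the name `author`; a later `"delete"` on the same path removes that entry again.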
- FIGS. 35A and 35B are flowcharts in which a part of the search index update process described in FIGS. 26A and 26B of the second embodiment is changed. Compared with the search index update process of FIGS. 26A and 26B, this flowchart differs in that a search is performed using the metadata name management table 7600 on the file server 2100, and the files hit by the search are also handled as search index update target files. The details are described below.
- Instead of step S411, the indexing control subprogram 1173 uses the metadata name management table 7600 on the file server 2100 to search for files that include the same character string as the metadata name associated with the field name acquired in step S410, and adds to the search index update target list any file name among the hits that is not yet registered in the list (step S416).
- The file name of each hit is taken from the file path name stored in the file list column 7630 of the metadata name management table 7600.
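- The lookup of step S416 can be illustrated with a short sketch. It assumes the metadata name management table 7600 can be modeled as a mapping from a metadata name to the list of file path names in the file list column 7630, and that "the same character string" means an exact match; the function name, the sample metadata names, and the sample paths are hypothetical.

```python
from typing import Dict, Iterable, List


def add_hits_to_update_list(
    metadata_names: Iterable[str],
    name_table: Dict[str, List[str]],
    update_list: List[str],
) -> List[str]:
    """Append to update_list every path registered under one of metadata_names,
    skipping paths that are already in the list (step S416)."""
    already_listed = set(update_list)
    for name in metadata_names:
        for path in name_table.get(name, []):   # file list column 7630
            if path not in already_listed:
                update_list.append(path)
                already_listed.add(path)
    return update_list


# Hypothetical table contents and an update list that already holds one file.
name_table = {
    "dc:creator": ["/share/a.xml", "/share/b.xml"],
    "dc:title": ["/share/b.xml", "/share/c.xml"],
}
update_list = ["/share/a.xml"]
add_hits_to_update_list(["dc:creator", "dc:title"], name_table, update_list)
print(update_list)  # ['/share/a.xml', '/share/b.xml', '/share/c.xml']
```

Keeping a set of already listed paths avoids rescanning the list for every hit, which matters when the update target list grows large.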
- The fifth embodiment described above shows an example in which the metadata name management table 7600 is managed by the file server 2100.
- However, the metadata name management table 7600 may be managed by a server other than the search server 1100 and the file server 2100.
- For example, the metadata name management table 7600 may be managed by a metadata name management server.
- A control in which the metadata name management table 7600 is newly introduced into an arbitrary server (hereinafter referred to as the metadata management server 5100), the search server 1100 and the file server 2100 update the table, and the search server 1100 updates the search index by differential indexing using the table will be described as a sixth embodiment.
- In the sixth embodiment, the metadata name management table 7600 is introduced into the metadata management server 5100, which is separate from the search server 1100 and the file server 2100, and the search index is updated by differential indexing using the table.
- Accordingly, the metadata management server 5100 is newly added, and parts of the search index update process and of the file access process in the file server 2100 are changed.
- FIG. 36 is a block diagram in which a part of the computer system configuration of the first embodiment is changed. This configuration differs from that of the first embodiment in that a metadata management server 5100 is added.
- The metadata management server 5100 will be described later.
- Although only one metadata management server 5100 is shown in the example of FIG. 36, the present invention is not limited to this.
- A plurality of metadata management servers 5100 may exist. Otherwise, the configuration is the same as that of the first embodiment.
- FIG. 37 is a block diagram illustrating the configuration of the metadata management server 5100.
- The metadata management server 5100 comprises a processor 5110 that executes programs, a memory 5120 that temporarily stores programs and data, an external storage device I/F 5130 for accessing the external storage device 5160, a network I/F 5140 for accessing other devices connected via a network, and a bus 5150 that connects them.
- The memory 5120 stores an external storage device I/F control program 5121 for controlling the external storage device I/F 5130, a network I/F control program 5122 for controlling the network I/F 5140, a data control program 5123 that provides the file system or database used by the metadata management server 5100 to manage stored data, a metadata name extraction control program 5124, and the metadata name management table 7600.
- FIGS. 38A and 38B are flowcharts in which a part of the search index update process described in FIGS. 32A and 32B of the fourth embodiment is changed. Compared with the search index update process of FIGS. 32A and 32B, this flowchart differs in that the metadata names extracted from the target file are registered in the metadata name management table 7600 on the metadata management server 5100, and the search index update target files can be identified using that table. The details are described below.
- In FIGS. 38A and 38B, the changes from FIGS. 32A and 32B are that steps S414 and S414A are replaced by steps S417 and S417A, and step S415 in FIG. 32B is replaced by step S418. The remaining steps are unchanged, so only the changed parts are described below.
- After step S404, after step S409, or in the case of No in step S408, the indexing control subprogram 1173, instead of performing step S414, extracts the metadata names from the target file and reflects the extracted information in the metadata name management table 7600 on the metadata management server 5100 (step S417). This process may be performed as part of the indexing process for each target file. It only needs to be performed when metadata name extraction is carried out by the search server 1100, and is unnecessary when the metadata names are extracted by the file server 2100.
- Following step S406, the indexing control subprogram 1173, instead of performing step S415, uses the metadata name management table 7600 on the metadata management server 5100 to search for files that include the same character string as the metadata name associated with the field name, and adds to the search index update target list any file name among the hits that is not yet registered in the list (step S418).
- FIG. 39 is a flowchart in which a part of the file access process in the file server 2100 described in FIG. 34 of the fifth embodiment is changed. Compared with the file access process of FIG. 34, this flowchart differs in that the extracted metadata names are reflected in the metadata name management table 7600 on the metadata management server 5100 and, in the file deletion process, the information related to the deletion target file is deleted from that table. The details are described below.
- The changes from FIG. 34 are that step S602 is replaced by step S608 and step S605 is replaced by step S609.
- The rest is the same as FIG. 34 of the fifth embodiment. Only the changed parts are described below.
- In the case of Yes in step S601, the file sharing control program 2124, instead of performing step S602, extracts the metadata names from the target file and reflects them in the metadata name management table 7600 on the metadata management server 5100 (step S608). This process only needs to be performed when metadata name extraction is carried out by the file server 2100, and is unnecessary when the metadata names are extracted by the search server 1100.
- In the case of Yes in step S604, the file sharing control program 2124, instead of performing step S605, deletes the information related to the deletion target file from the metadata name management table 7600 on the metadata management server 5100 (step S609).
- This process, too, only needs to be performed when metadata name extraction is carried out by the file server 2100, and is unnecessary when the metadata names are extracted by the search server 1100.
- As described above, in the sixth embodiment the metadata name management table 7600 is provided by the metadata management server 5100, which is separate from the search server 1100 and the file server 2100, and the search index update target files can be narrowed down with the same degree of accuracy as in the third embodiment.
- The embodiments so far show examples in which the metadata extraction process for registration in the metadata name management table 7600 is performed by the search server 1100 or the file server 2100.
- This metadata extraction process may instead be performed by a server other than the search server 1100 and the file server 2100.
- For example, in a system configuration in which file accesses pass through a reverse proxy server, the reverse proxy server may also serve as a metadata extraction server and extract metadata from the target file.
- A control method in which a metadata extraction server extracts the metadata names of the target file and stores the extracted information in the metadata name management table 7600 will be described as a seventh embodiment.
- The metadata extraction server may extract not only metadata names but also arbitrary information, and that information may be reflected in an arbitrary management table.
- FIG. 40 is a block diagram obtained by changing the configuration of the computer system described in FIG. 36 according to the sixth embodiment. This configuration is different from the configuration described in FIG. 36 of the sixth embodiment in that a metadata extraction server 6100 is added.
- The metadata extraction server 6100 will be described later. Although only one metadata extraction server 6100 is shown in the illustrated example, this is not restrictive, and there may be a plurality of metadata extraction servers 6100. Otherwise, the configuration is unchanged.
- FIG. 41 is an explanatory diagram illustrating the configuration of the metadata extraction server 6100.
- The metadata extraction server 6100 comprises a processor 6110 that executes programs, a memory 6120 that temporarily stores programs and data, an external storage device I/F 6130 for accessing the external storage device 6160, a network I/F 6140 for accessing other devices connected via a network, and a bus 6150 that connects them.
- The memory 6120 stores an external storage device I/F control program 6121 for controlling the external storage device I/F 6130, a network I/F control program 6122 for controlling the network I/F 6140, a data control program 6123 that provides the file system or database used by the metadata extraction server 6100 to manage stored data, and a metadata name extraction control program 6124.
- The metadata name extraction control program 6124 may extract character strings corresponding to metadata names from the target file using a known character string analysis program.
- The metadata extraction server 6100 may manage the metadata name management table 7600.
- FIG. 42 shows the flow of metadata extraction processing provided by the metadata extraction control program 6124 of the metadata extraction server 6100.
- The processor 6110 that executes the metadata extraction control program 6124 acquires a file to be extracted, performs the metadata extraction process, and outputs the extracted metadata.
- First, the processor 6110 that executes the metadata extraction control program 6124 receives a metadata extraction target file (step S701). How the metadata extraction control program 6124 identifies and acquires the extraction target file is not described in detail here because a known technique may be used.
- For example, the metadata extraction server 6100 may periodically crawl the file system 2170 of the file server 2100 that stores the extraction target files, may have update target files sent to it from the file server 2100 or the like, or may be positioned as a reverse proxy server for file access operations so that it obtains information on the target file in each file access operation.
- Next, the metadata extraction control program 6124 extracts metadata from the target file, executes a predetermined process, and outputs the extraction result to a predetermined location (computer) (step S702).
- For example, the extracted metadata name-value pairs may be sent to the search server 1100 so that the search server 1100 indexes the metadata set for metadata search.
- The output destination may be within the server 6100 itself or a remote server. The output method and output format may be any that the output destination can accept.
- Then, the metadata extraction control program 6124 reflects the extracted metadata names in the server (5100) that holds the metadata name management table 7600 (step S703).
- The server that holds the metadata name management table 7600 may be any server, local or remote. With this process, when the search server 1100 performs the search index update process by differential indexing, the information stored in the metadata name management table 7600 can be used to identify the search index update target files.
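- Steps S701 to S703 can be sketched as below. This is a hedged illustration only: the `name=value` line format, the function names, and the use of in-memory dictionaries for the output destination and for the metadata name management table 7600 are assumptions of the example, and a real deployment would send both results over the network to whichever servers hold them.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def extract_pairs(content: str) -> List[Pair]:
    """Toy extractor (assumption): 'name=value' lines become metadata pairs."""
    pairs: List[Pair] = []
    for line in content.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            pairs.append((name.strip(), value.strip()))
    return pairs


def process_file(
    path: str,
    content: str,
    output: Dict[str, List[Pair]],
    name_table: Dict[str, Set[str]],
) -> List[Pair]:
    # step S701: the extraction target file (path and content) has been received
    pairs = extract_pairs(content)        # step S702: extract the metadata ...
    output[path] = pairs                  # ... and output the result to its destination
    for name, _value in pairs:            # step S703: reflect the names in table 7600
        name_table.setdefault(name, set()).add(path)
    return pairs
```

Separating the name-value output (step S702) from the name registration (step S703) mirrors the text: the former feeds metadata search, while the latter only supports identifying update targets during differential indexing.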
- As described above, in the seventh embodiment the metadata extraction process can be performed by the metadata extraction server 6100, which is separate from the search server 1100 and the file server 2100. This makes it possible to distribute the load of the metadata extraction process.
- In the embodiments described so far, when the search server 1100 performs a search index update process by differential indexing, the search server 1100 crawls all files on the file server 2100 and checks the last update date and time of each target file to identify the search index update target files. However, the search index update target files can also be identified on the file server 2100 side. Specifically, the file server 2100 stores a file operation history for the files it holds. Each entry in this history is also assigned an operation type, such as creation, update, deletion, or reference.
- The file server 2100 then provides a search service based on the file operation history. Specifically, when a list of the files created, updated, or deleted after an arbitrary date and time is wanted, the requester specifies this as a search condition, and the file server 2100 returns a list of the files that match the condition as the search result.
- If the search server 1100 can use this mechanism, it can specify the previous index update date and time as the search condition, request from the file server 2100 a list of the files created, updated, or deleted after that date and time, and obtain the list of matching files. Because it is then unnecessary to crawl all search target files, the differential indexing process on the search server 1100 side becomes more efficient.
- To realize this, the file server 2100 needs a mechanism for storing the file operation history and for searching it (hereinafter referred to as “Change File Notification control”).
- A control in which the operation history for files is stored in the file server 2100 in a searchable form and, after the search index schema management table 7300 is updated, the search server 1100 uses this search function to identify the search index update target files when performing the search index update process by differential indexing will be described as an eighth embodiment.
- FIG. 43 shows a block diagram in which a part of the configuration of the file server 2100 described in FIG. 3 of the first embodiment is changed.
- This configuration differs from the configuration described with reference to FIG. 3 in that a Change File Notification control program 2126 and a file update list management table 7700 are newly added on the memory 1120.
- The Change File Notification control program 2126 and the file update list management table 7700 will be described later.
- Otherwise, the configuration is the same as that of the first embodiment.
- FIG. 44 is a diagram illustrating a configuration of the file update list management table 7700 managed on the file server 2100.
- The file update list management table 7700 records and manages event information on the creation, update, and deletion of files performed on the file server 2100 in response to requests from users (client machine 4100).
- The file update list management table 7700 includes components such as an occurrence date and time 7710, an operation type 7720, an object type 7730, and a path name 7740.
- The occurrence date and time 7710 stores the date and time at which an event related to creation, update, or deletion occurred.
- The operation type 7720 stores the type of the event. Specifically, types such as creation, update, and deletion are registered. For an update, information identifying the object in which the update occurred may also be added.
- The object type 7730 stores the type used to classify the target in which the event occurred. Specifically, when a file system is used, a type such as file or directory is registered. When a database is used, types such as record, column, and tuple are registered.
- The path name 7740 stores the information necessary for accessing the object in which the event occurred. Specifically, when a file system is used, information such as the path name and node number of the target file may be stored. When a database is used, information such as the identifying record number of the target record may be stored.
- FIG. 45 is a flowchart showing the flow of the file update list registration process, which the data control program 2123 and the Change File Notification control program 2126 perform in cooperation when a file access is made to the file system managed by the data control program 2123 of the file server 2100.
- The Change File Notification control program 2126 constantly monitors the file access operations in the data control program 2123 and performs the operation described below when an event requiring registration in the file update list occurs.
- The processor 2110 that executes the data control program 2123 executes the requested operation on the file system (step S801). For example, for a file creation request, a file with the specified name is created. For a file update request, the specified update contents are reflected in the specified file. For a file deletion request, the specified file is deleted.
- Next, it is determined whether the operation type for the target file is creation, update, or deletion (step S802). That is, step S802 determines whether the file operation is an event that requires registration in the file update list.
- If the event requires registration in the file update list (Yes in step S802), the data control program 2123 notifies the Change File Notification control program 2126, the file operation is registered in the file update list management table 7700 (step S803), and the process ends. If the event does not require registration (No in step S802), the process ends.
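- The registration flow of FIG. 45 can be sketched as follows. The record fields follow the columns of the file update list management table 7700 described above; everything else (the class and function names, the dictionary standing in for the file system, the operation strings, and the fixed object type `file`) is a hypothetical simplification for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class UpdateRecord:
    occurred_at: datetime   # occurrence date and time 7710
    operation: str          # operation type 7720
    object_type: str        # object type 7730
    path_name: str          # path name 7740


REGISTERED_OPERATIONS = {"create", "update", "delete"}


def perform_file_operation(
    fs: Dict[str, str],
    update_list: List[UpdateRecord],
    operation: str,
    path: str,
    content: str = "",
    now: Optional[datetime] = None,
) -> None:
    # step S801: carry out the requested operation on the file system
    if operation in ("create", "update"):
        fs[path] = content
    elif operation == "delete":
        fs.pop(path, None)
    # step S802: only creation, update, and deletion are registered
    if operation in REGISTERED_OPERATIONS:
        # step S803: record the event in the file update list
        update_list.append(UpdateRecord(now or datetime.now(), operation, "file", path))
```

A reference (read) operation passes through step S801 without touching the file system stand-in and is filtered out at step S802, so the list only grows with events that can affect the search index.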
- FIGS. 46A and 46B are flowcharts in which a part of the search index update process described in FIGS. 26A and 26B of the second embodiment is changed. Compared with the search index update process of FIGS. 26A and 26B, this flowchart differs in that the update target files of the search index are identified by a search that uses the Change File Notification control program 2126 on the file server 2100. The details are described below.
- The change is that steps S406, S407, S408, and S409 of FIG. 26B, which are performed after step S410, are replaced by step S419. The rest is the same as FIGS. 26A and 26B. Only the changed part is described below.
- The indexing control subprogram 1173 executes a file update list inquiry process against the file server 2100 instead of steps S406 to S409 in FIG. 26B (step S419).
- In the file update list inquiry process, the file server 2100 is requested to provide a list of the files newly created, updated, or deleted since the previous search index update date and time.
- The file update list inquiry process will be described later.
- Corresponding to step S409 of FIG. 26B, the indexing control subprogram 1173 adds the file names in the file list acquired by the file update list inquiry process to the search index update target list.
- FIG. 47 is a flowchart showing the flow of the file update list inquiry process in step S419 in FIG. 46B.
- The indexing control subprogram 1173 determines whether the search server 1100 has acquired the entire list of desired index update target files (step S901).
- When the index update target list is acquired in divided portions, it is determined whether all of the portions have been acquired. If all of them have been acquired (Yes in step S901), this process ends.
- If not (No in step S901), the indexing control subprogram 1173 sends a file update list inquiry request to the file server 2100 together with the file update list acquisition conditions (step S902).
- Here, the date and time at which the search server 1100 last updated the search index is specified as the acquisition condition and transmitted.
- The Change File Notification control program 2126 on the file server 2100 that has received the inquiry searches the file update list management table 7700 and extracts the records that match the specified acquisition condition (step S903).
- The Change File Notification control program 2126 converts the information about the extracted records into a format that the request source can process and provides it to the search server 1100 (step S904). Thereafter, the process returns to step S901, and the above processing is repeated.
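- The inquiry loop of FIG. 47 can be sketched as below, under stated assumptions. The text says only that the list may be acquired in divided portions; the offset and limit scheme used here to divide it, the function names, and the direct function call standing in for the network request between the search server 1100 and the file server 2100 are all hypothetical.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional, Tuple


class UpdateRecord(NamedTuple):
    occurred_at: datetime
    operation: str
    path_name: str


def query_update_list(
    table: List[UpdateRecord], since: datetime, offset: int, limit: int
) -> Tuple[List[UpdateRecord], Optional[int]]:
    """File server side (steps S903 and S904): return one portion of the matching
    records and the offset of the next portion, or None when nothing is left."""
    matches = [r for r in table if r.occurred_at > since]
    page = matches[offset:offset + limit]
    next_offset = offset + limit if offset + limit < len(matches) else None
    return page, next_offset


def fetch_update_targets(
    table: List[UpdateRecord], since: datetime, page_size: int = 2
) -> List[str]:
    """Search server side: repeat the inquiry until every portion is received."""
    targets: List[str] = []
    offset: Optional[int] = 0
    while offset is not None:                                    # step S901
        page, offset = query_update_list(table, since, offset, page_size)  # step S902
        for record in page:
            if record.path_name not in targets:                  # list each file once
                targets.append(record.path_name)
    return targets
```

A file that was updated several times since the last index update appears in several records but only once in the resulting target list, which is what the search index update target list needs.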
- As described above, in the eighth embodiment the search server 1100 can efficiently perform the search index update process by differential indexing without crawling all files.
Description
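Embodiment 1 of the present invention has been described above, but it goes without saying that the present invention is not limited to Embodiment 1 and can take various configurations without departing from its spirit.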
Claims (18)
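- A search device that comprises a processor and a memory and searches data,
the search device comprising:
metadata schema management information that manages a metadata schema definition defining the structure of a search target file containing metadata;
search index schema management information that manages a search index schema definition defining the structure of search index data;
schema mapping management information that manages the correspondence between the metadata schema definition information and the search index schema definition; and
a search control unit that receives a search request and extracts a file matching the search request by referring to the schema mapping management information and the search index management information, wherein
the metadata schema management information
includes an alias of a namespace identifying the metadata schema definition and a metadata name,
the search index schema definition
includes a field name of the search target file,
the schema mapping management information
includes the correspondence between the metadata name and the field name, and
the search control unit
extracts at least one of the alias and the metadata name from the search request,
converts the alias into a metadata name by referring to the metadata schema management information, and
identifies a field name from the metadata name by referring to the schema mapping management information.
- The search device according to claim 1,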
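further comprising search index management information that holds the correspondence between the field name set in the search index schema definition and files, wherein
the search control unit
extracts at least one of the alias and the metadata name from the search request,
identifies a metadata name from the alias by referring to the metadata schema management information,
identifies a field name from the metadata name by referring to the schema mapping management information, and
identifies a file matching the identified field name by referring to the search index management information.
- The search device according to claim 2, wherein
the search control unit,
when registering the search index schema definition, receives for each field name information identifying whether the search index management information is to be created, and registers the field name in the search index management information when information indicating creation is received.
- The search device according to claim 1, wherein
the metadata schema management information
stores a namespace alias that the search control unit can uniquely identify.
- The search device according to claim 2, wherein
the search index schema management information
includes update date and time information storing the date and time at which any one of an update, creation, or deletion operation was performed on the search index schema definition, and
the search control unit
executes differential indexing in which it identifies a field name whose update date and time information satisfies a predetermined condition, identifies from the search index management information a file corresponding to the identified field name as an update target file, and updates the value of the search index management information for the identified update target file.
- The search device according to claim 2, wherein
the search control unit
identifies, by crawling, a file for which the value of the search index management information is to be updated as an update target file, and updates the value of the search index management information for the identified update target file.
- The search device according to claim 2, wherein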
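the search device is connected to a file server that stores the search target files,
the file server
accumulates an operation history for the search target files and identifies a file whose operation history matches a predetermined condition as an update target file, and
the search control unit
updates the value of the search index management information for the update target file identified by the file server.
- The search device according to claim 5, wherein
the search control unit,
after updating the search index schema definition, updates the value of the search index management information for all of the update target files and incorporates into the search index management information the information of the files corresponding to the updated search index schema definition.
- The search device according to claim 5, wherein
the search control unit,
after updating the search index schema definition, extracts and indexes all metadata names of the update target files, searches for a file that includes, as a metadata name, the same character string as the metadata name associated with the field name of the search index schema definition, updates the value of the search index management information with the file hit by the search as an update target file, and incorporates into the search index management information the information of the files corresponding to the updated search index schema definition.
- The search device according to claim 5,
comprising metadata name management information that stores the files in which a metadata name associated with a field name of the search index schema management information exists, wherein
the search control unit,
after updating the search index schema definition, searches the metadata name management information for a file that includes the same character string as the metadata name associated with the field name of the search index schema definition, updates the value of the search index management information with the file hit by the search as an update target file, and incorporates into the search index management information the information of the files corresponding to the updated search index schema definition.
- The search device according to claim 5, wherein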
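the search device is connected to a file server that stores the search target files,
the file server
has metadata name management information that stores the files in which a metadata name associated with a field name of the search index schema management information exists, and
the search control unit,
after updating the search index schema definition, searches the metadata name management information of the file server for a file that includes the same character string as the metadata name associated with the field name of the search index schema definition, updates the value of the search index management information with the file hit by the search as an update target file, and incorporates into the search index management information the information of the files corresponding to the updated search index schema definition.
- The search device according to claim 5, wherein
the search device is connected to a metadata management server,
the metadata management server
has metadata name management information that stores the files in which a metadata name associated with a field name of the search index schema management information exists, and
the search control unit,
after updating the search index schema definition, searches the metadata name management information of the metadata management server for a file that includes the same character string as the metadata name associated with the field name of the search index schema definition, updates the value of the search index management information with the file hit by the search as an update target file, and incorporates into the search index management information the information of the files corresponding to the updated search index schema definition.
- The search device according to claim 5, wherein
the search device is connected to a file server that stores the search target files, and
the file server
extracts the files in which a metadata name associated with a field name of the search index schema management information exists.
- The search device according to claim 5, wherein
the search device is connected to a metadata extraction server that extracts metadata, and
the metadata extraction server
extracts the files in which a metadata name associated with a field name of the search index schema management information exists.
- A control method for a search device that comprises a processor and a memory and searches data, wherein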
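the search device has:
metadata schema management information that manages, together with a metadata name, an alias of a namespace identifying a metadata schema definition defining the structure of a search target file containing metadata;
search index schema management information that manages a search index schema definition that includes a field name of the search target file and defines the structure of search index data; and
schema mapping management information that manages the correspondence between the metadata name of the metadata schema definition information and the field name of the search index schema definition,
the control method comprising:
a first step in which the search device receives a search request;
a second step in which the search device extracts at least one of the alias and the metadata name from the search request;
a third step in which the search device converts the alias into a metadata name by referring to the metadata schema management information; and
a fourth step in which the search device identifies a field name from the metadata name by referring to the schema mapping management information.
- The control method for a search device according to claim 15, wherein
the search device
further has search index management information that holds the correspondence between the field name set in the search index schema definition and files, and
the control method further comprises a fifth step in which the search device identifies a file matching the identified field name by referring to the search index management information.
- A non-transitory recording medium storing a program for a computer that comprises a processor and a memory and searches data, wherein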
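the computer has:
metadata schema management information that manages, together with a metadata name, an alias of a namespace identifying a metadata schema definition defining the structure of a search target file containing metadata;
search index schema management information that manages a search index schema definition that includes a field name of the search target file and defines the structure of search index data; and
schema mapping management information that manages the correspondence between the metadata name of the metadata schema definition information and the field name of the search index schema definition, and
the program causes the computer to execute:
a first step of receiving a search request;
a second step of extracting at least one of the alias and the metadata name from the search request;
a third step of converting the alias into a metadata name by referring to the metadata schema management information; and
a fourth step of identifying a field name from the metadata name by referring to the schema mapping management information.
- The non-transitory recording medium according to claim 17, wherein
the computer
further has search index management information that holds the correspondence between the field name set in the search index schema definition and files, and
the program
further includes a fifth step of identifying a file matching the identified field name by referring to the search index management information.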
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/067942 WO2014010082A1 (ja) | 2012-07-13 | 2012-07-13 | Retrieval device, method for controlling retrieval device, and recording medium |
US14/413,868 US9767108B2 (en) | 2012-07-13 | 2012-07-13 | Retrieval device, method for controlling retrieval device, and recording medium |
JP2014524573A JP5843965B2 (ja) | 2012-07-13 | 2012-07-13 | Retrieval device, method for controlling retrieval device, and recording medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/067942 WO2014010082A1 (ja) | 2012-07-13 | 2012-07-13 | Retrieval device, method for controlling retrieval device, and recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014010082A1 (ja) | 2014-01-16 |
Family
ID=49915583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/067942 WO2014010082A1 (ja) | Retrieval device, method for controlling retrieval device, and recording medium | 2012-07-13 | 2012-07-13 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9767108B2 (ja) |
JP (1) | JP5843965B2 (ja) |
WO (1) | WO2014010082A1 (ja) |
US20220292069A1 (en) * | 2017-01-23 | 2022-09-15 | Oliver Wendel Gamble | Method and System for Enhancement and Cross Relating Messages Received and Stored on a Mobile Device |
US10896097B1 (en) | 2017-05-25 | 2021-01-19 | Palantir Technologies Inc. | Approaches for backup and restoration of integrated databases |
GB201708818D0 (en) | 2017-06-02 | 2017-07-19 | Palantir Technologies Inc | Systems and methods for retrieving and processing data |
US10956406B2 (en) | 2017-06-12 | 2021-03-23 | Palantir Technologies Inc. | Propagated deletion of database records and derived data |
US11334552B2 (en) | 2017-07-31 | 2022-05-17 | Palantir Technologies Inc. | Lightweight redundancy tool for performing transactions |
US10417224B2 (en) | 2017-08-14 | 2019-09-17 | Palantir Technologies Inc. | Time series database processing system |
US10216695B1 (en) | 2017-09-21 | 2019-02-26 | Palantir Technologies Inc. | Database system for time series data storage, processing, and analysis |
US10614069B2 (en) | 2017-12-01 | 2020-04-07 | Palantir Technologies Inc. | Workflow driven database partitioning |
US11281726B2 (en) | 2017-12-01 | 2022-03-22 | Palantir Technologies Inc. | System and methods for faster processor comparisons of visual graph features |
US11016986B2 (en) | 2017-12-04 | 2021-05-25 | Palantir Technologies Inc. | Query-based time-series data display and processing system |
US11934370B1 (en) * | 2017-12-11 | 2024-03-19 | Amazon Technologies, Inc. | Data store indexing engine with automated refresh |
US10754822B1 (en) | 2018-04-18 | 2020-08-25 | Palantir Technologies Inc. | Systems and methods for ontology migration |
GB201807534D0 (en) | 2018-05-09 | 2018-06-20 | Palantir Technologies Inc | Systems and methods for indexing and searching |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
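JP2001092844A (ja) * | 1999-09-24 | 2001-04-06 | Nippon Telegr & Teleph Corp <Ntt> | Heterogeneous information source query conversion method and apparatus, and storage medium storing a heterogeneous information source query conversion program |
JP2007072526A (ja) * | 2005-09-02 | 2007-03-22 | Fuji Xerox Co Ltd | Repository, data input device, and program |
JP2007257083A (ja) * | 2006-03-20 | 2007-10-04 | Fujitsu Ltd | Database integrated reference program, database integrated reference method, and database integrated reference apparatus |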
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000224257A (ja) * | 1999-01-29 | 2000-08-11 | Jisedai Joho Hoso System Kenkyusho:Kk | Transmitting device and receiving device |
US7051019B1 (en) * | 1999-08-17 | 2006-05-23 | Corbis Corporation | Method and system for obtaining images from a database having images that are relevant to indicated text |
US7613993B1 (en) * | 2000-01-21 | 2009-11-03 | International Business Machines Corporation | Prerequisite checking in a system for creating compilations of content |
US20060026567A1 (en) * | 2004-07-27 | 2006-02-02 | Mcvoy Lawrence W | Distribution of data/metadata in a version control system |
US7809763B2 (en) * | 2004-10-15 | 2010-10-05 | Oracle International Corporation | Method(s) for updating database object metadata |
US7716198B2 (en) * | 2004-12-21 | 2010-05-11 | Microsoft Corporation | Ranking search results using feature extraction |
US7340686B2 (en) * | 2005-03-22 | 2008-03-04 | Microsoft Corporation | Operating system program launch menu search |
US7734644B2 (en) * | 2005-05-06 | 2010-06-08 | Seaton Gras | System and method for hierarchical information retrieval from a coded collection of relational data |
US7630999B2 (en) * | 2005-07-15 | 2009-12-08 | Microsoft Corporation | Intelligent container index and search |
JP4468309B2 (ja) * | 2006-01-18 | 2010-05-26 | 株式会社東芝 | Metadata search apparatus, metadata search method, and metadata search program |
US7739260B1 (en) * | 2006-12-28 | 2010-06-15 | Scientific Components Corporation | Database search system using interpolated data with defined resolution |
US7725454B2 (en) * | 2007-07-20 | 2010-05-25 | Microsoft Corporation | Indexing and searching of information including handler chaining |
US8433697B2 (en) * | 2011-09-10 | 2013-04-30 | Microsoft Corporation | Flexible metadata composition |
US9417796B2 (en) * | 2012-06-29 | 2016-08-16 | M-Files Oy | Method, a server, a system and a computer program product for copying data from a source server to a target server |
2012
- 2012-07-13 US US14/413,868 patent/US9767108B2/en not_active Expired - Fee Related
- 2012-07-13 JP JP2014524573A patent/JP5843965B2/ja not_active Expired - Fee Related
- 2012-07-13 WO PCT/JP2012/067942 patent/WO2014010082A1/ja active Application Filing
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015210818A (ja) * | 2014-04-24 | 2015-11-24 | キヤノン株式会社 | Devices, systems, and methods for context management |
US9922092B2 (en) | 2014-04-24 | 2018-03-20 | Canon Kabushiki Kaisha | Devices, systems, and methods for context management |
JP2017532626A (ja) * | 2014-08-21 | 2017-11-02 | ドロップボックス, インコーポレイテッド | Multi-user search system with methodology for instant indexing |
US10579609B2 (en) | 2014-08-21 | 2020-03-03 | Dropbox, Inc. | Multi-user search system with methodology for bypassing instant indexing |
US10817499B2 (en) | 2014-08-21 | 2020-10-27 | Dropbox, Inc. | Multi-user search system with methodology for personal searching |
US10853348B2 (en) | 2014-08-21 | 2020-12-01 | Dropbox, Inc. | Multi-user search system with methodology for personalized search query autocomplete |
US10977324B2 (en) | 2015-01-30 | 2021-04-13 | Dropbox, Inc. | Personal content item searching system and method |
US11120089B2 (en) | 2015-01-30 | 2021-09-14 | Dropbox, Inc. | Personal content item searching system and method |
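JP2017143490A (ja) * | 2016-02-12 | 2017-08-17 | 株式会社東芝 | Information processing apparatus, system, program, and method |
CN110471888A (zh) * | 2018-05-09 | 2019-11-19 | 株式会社日立制作所 | Method, apparatus, medium, device, and system for automatically collecting data |
CN112507187A (zh) * | 2020-11-11 | 2021-03-16 | 贝壳技术有限公司 | Index change method and apparatus |
CN114840487A (zh) * | 2022-03-25 | 2022-08-02 | 阿里巴巴(中国)有限公司 | Metadata management method and apparatus for a distributed file system |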
Also Published As
Publication number | Publication date |
---|---|
JPWO2014010082A1 (ja) | 2016-06-20 |
US20150213043A1 (en) | 2015-07-30 |
US9767108B2 (en) | 2017-09-19 |
JP5843965B2 (ja) | 2016-01-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5843965B2 (ja) | Search device, search device control method, and recording medium | |
JP6006267B2 (ja) | System and method for narrowing a search using index keys | |
US8396894B2 (en) | Integrated repository of structured and unstructured data | |
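CN102483765B (zh) | File search system and program | |
KR101927450B1 (ko) | Method for providing a REST API service for processing large-volume unstructured data | |
JP5557824B2 (ja) | Differential indexing method for hierarchical file storage | |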
US11782921B2 (en) | Columnar cache query using hybrid query execution plan | |
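JP5979895B2 (ja) | Document management system, computer program, and document management method | |
JP6586050B2 (ja) | Management device, management method, and management program | |
JP5172931B2 (ja) | Search device, search method, and search program | |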
EP2690563A1 (en) | Method and device for reporting on data of documents generated from templates | |
US20130218928A1 (en) | Information processing device | |
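JP6638053B1 (ja) | Document creation support system | |
JP6710881B1 (ja) | Document creation support system | |
JP7381290B2 (ja) | Computer system and data management method | |
JP2016062522A (ja) | Database management system, database system, database management method, and database management program | |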
Digles et al. | Accessing the open PHACTS discovery platform with workflow tools | |
JP2013105244A (ja) | Document management program, information processing apparatus, and document management method | |
US20210141773A1 (en) | Configurable Hyper-Referenced Associative Object Schema | |
JP5542857B2 (ja) | Query issuing device, query issuing program, and query issuing method | |
JP2008287663A (ja) | Resource management device | |
Freeman et al. | SportsStore: Modifying and Deleting Data | |
JP6463240B2 (ja) | Query creation support method and information processing apparatus | |
Sharma et al. | Indexing | |
JP2007249551A (ja) | Information management method, program, and information management apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12881113; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2014524573; Country of ref document: JP; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: 14413868; Country of ref document: US |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 12881113; Country of ref document: EP; Kind code of ref document: A1 |