WO2023188049A1

WO2023188049A1 - Metadata management system, metadata management method, and program

Info

Publication number: WO2023188049A1
Application number: PCT/JP2022/015745
Authority: WO
Inventors: 岳夫荒木
Original assignee: 株式会社Robon
Priority date: 2022-03-30
Filing date: 2022-03-30
Publication date: 2023-10-05
Also published as: JPWO2023188049A1; JP7319475B1

Abstract

Provided is a metadata management system for managing metadata, the metadata management system comprising a general metadata acquisition unit that acquires each general metadata that is metadata defined by a certain metadata schema, and a document metadata creation unit that creates each document metadata from each general metadata. The present invention thereby enables intended metadata to be acquired easily.

Description

Metadata management system, metadata management method, program

The present disclosure relates to a metadata management system, a metadata management method, and a program.

Conventionally, as this type of technology, a data item registration and search method has been proposed for a system that registers the correspondence between data item names and data attributes for the data items (metadata) handled during a series of processes in software design and development, and uses them as a shared resource. In this method, a data item dictionary manages data items, each of which combines a purpose of use and attributes under a single name. When a data item is registered, or when a registered data item is searched for, the attributes of the target are uniquely determined by using the name and the purpose of use as keys (for example, see Patent Document 1).

Patent Document 1: Japanese Patent Application Publication No. 5-197533

In order to utilize data, searching for desired data using metadata is conceivable. However, if the metadata schemas of the individual metadata are not unified, it has been difficult to easily obtain the desired metadata.

The main purpose of the metadata management system, metadata management method, and program of the present disclosure is to make it possible to easily obtain desired metadata.

The metadata management system, metadata management method, and program of the present disclosure employ the following means to achieve the above-mentioned main purpose.

The gist of the metadata management system of the present disclosure is that it is
a metadata management system that manages metadata, the system comprising:
a general type metadata acquisition unit that acquires each general type metadata, each of which is metadata defined by an arbitrary metadata schema; and
a document type metadata creation unit that creates each document type metadata from each of the general type metadata.

In the metadata management system of the present disclosure, each general type metadata, each of which is metadata defined by an arbitrary metadata schema, is acquired, and each document type metadata is created from each general type metadata. Because each document type metadata is created, performing a full-text search on the document type metadata according to the user's instructions allows the user to easily obtain the desired metadata even if the metadata schemas of the individual metadata are not unified. In addition, performing natural language processing on each document type metadata makes it possible to supplement the information in each document type metadata and to assist the user in searching for similar document type metadata. For example, natural language processing is difficult to perform on general type metadata, and supplementing the information manually would take considerable effort; once document type metadata has been created, natural language processing can be performed and the information can be supplemented easily.

Here, examples of natural language processing include document vector calculation processing, complementation processing, proofreading processing, classification processing, search processing for similar document type metadata, and the like. Further, natural language processing can be performed using deep learning, for example, techniques such as RNN (Recurrent Neural Network) and BERT (Bidirectional Encoder Representations from Transformers).

In the metadata management system of the present disclosure, the document type metadata creation unit may create the document type metadata from the general type metadata having a graph structure. In this case, the document type metadata creation unit may create the document type metadata having an outline structure from the general type metadata having a tree structure among the graph structures. In this case, the general type metadata may include first metadata about a data item holding a single value and second metadata about a set of the data items; the first metadata may include a first common part, a first metadata source dependent part, and a first user defined part; and the second metadata may include a second common part, a second metadata source dependent part, and a second user defined part. The general type metadata may have the second common part as a vertex; the second metadata source dependent part, the second user defined part, and a set of the first common parts as child nodes of the second common part; and the first metadata source dependent part and the first user defined part as child nodes of each first common part. The document type metadata may have, as a chapter, a second common part related part that is related to the second common part; as sections, a second metadata source dependent part related part that is related to the second metadata source dependent part, a second user defined part related part that is related to the second user defined part, and a first common part set related part that is related to the set of the first common parts; as a term, a first common part related part that is related to the first common part; and, as sub-terms, a first metadata source dependent part related part that is related to the first metadata source dependent part and a first user defined part related part that is related to the first user defined part.

In the metadata management system of the present disclosure, the general type metadata acquisition unit may acquire each of the general type metadata and add to it at least one of a first item set by the user, a second item proposed to and approved by the user, and a predetermined third item.

The metadata management system of the present disclosure may further include a full-text search unit that, when a user sets one or more search strings and issues a search instruction, performs a full-text search on each of the document type metadata using the search strings. By performing a full-text search on each document type metadata using the search strings, the user can easily obtain the desired metadata even if the metadata schemas of the individual metadata are not unified.

The metadata management system of the present disclosure may further include a natural language processing unit that performs natural language processing on the document type metadata, and at least the natural language processing unit may be configured as a multi-tenant system in a cloud environment. By performing natural language processing on the document type metadata, it is possible to supplement the information in the document type metadata and to assist the user in searching for similar document type metadata. Further, configuring the natural language processing unit as a multi-tenant system allows natural language processing to be performed with higher accuracy than configuring it as a single-tenant system.

In this case, the natural language processing unit may propose the calculation result to the user, and when the calculation result is approved by the user, the general type metadata acquisition unit may reflect the calculation result in the general type metadata and update it. In this way, the calculation results of natural language processing can be reflected in the general type metadata based on the user's intention.
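The propose-and-approve flow described above can be illustrated with a minimal Python sketch. The function name, the dictionary representation of general type metadata, and the example values are all assumptions made for illustration; the disclosure does not specify how approval is recorded.

```python
from typing import Dict, Optional, Set


def reflect_approved_results(
    general_metadata: Dict[str, Optional[str]],
    proposals: Dict[str, str],
    approved: Set[str],
) -> Dict[str, Optional[str]]:
    """Return a copy of the metadata in which only user-approved proposals are reflected."""
    updated = dict(general_metadata)
    for item, proposed_value in proposals.items():
        if item in approved:  # proposals that were not approved are ignored
            updated[item] = proposed_value
    return updated


# Hypothetical example: two calculation results are proposed, one is approved.
metadata = {"logical name": None, "description": None}  # None means blank (unknown)
proposals = {"logical name": "Student table", "description": "List of students"}
updated = reflect_approved_results(metadata, proposals, approved={"logical name"})
```

In this sketch the unapproved proposal leaves the corresponding item blank, so the general type metadata changes only according to the user's intention.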

The gist of the metadata management method of the present disclosure is that it is
a metadata management method for managing metadata, the method comprising:
(a) acquiring each general type metadata, each of which is metadata defined by an arbitrary metadata schema; and
(b) creating each document type metadata from each of the general type metadata.

In the metadata management method of the present disclosure, each general type metadata, each of which is metadata defined by an arbitrary metadata schema, is acquired, and each document type metadata is created from each general type metadata. Because each document type metadata is created, performing a full-text search on the document type metadata according to the user's instructions allows the user to easily obtain the desired metadata even if the metadata schemas of the individual metadata are not unified. In addition, performing natural language processing on each document type metadata makes it possible to supplement the information in each document type metadata and to assist the user in searching for similar document type metadata. For example, natural language processing is difficult to perform on general type metadata, and supplementing the information manually would take considerable effort; once document type metadata has been created, natural language processing can be performed and the information can be supplemented easily.

The gist of the program of the present disclosure is that it is
a program for causing a computer to function as a metadata management system for managing metadata, the program causing the computer to execute:
(a) acquiring each general type metadata, each of which is metadata defined by an arbitrary metadata schema; and
(b) creating each document type metadata from each of the general type metadata.

The program of the present disclosure acquires each general type metadata, each of which is metadata defined by an arbitrary metadata schema, and creates each document type metadata from each general type metadata. Because each document type metadata is created, performing a full-text search on the document type metadata according to the user's instructions allows the user to easily obtain the desired metadata even if the metadata schemas of the individual metadata are not unified. In addition, performing natural language processing on each document type metadata makes it possible to supplement the information in each document type metadata and to assist the user in searching for similar document type metadata. For example, natural language processing is difficult to perform on general type metadata, and supplementing the information manually would take considerable effort; once document type metadata has been created, natural language processing can be performed and the information can be supplemented easily.

FIG. 1 is a configuration diagram schematically showing the configuration of a metadata management system 20 as an embodiment of the present disclosure.
FIG. 2 is a flowchart illustrating an example of metadata acquisition related processing.
FIG. 3 is an explanatory diagram showing an example of data.
FIG. 4 is an explanatory diagram showing an example of general type metadata.
FIG. 5 is an explanatory diagram showing an example of general type metadata.
FIG. 6 is an explanatory diagram showing an example of document type metadata.
FIG. 7 is an explanatory diagram showing an example of general type metadata.
FIG. 8 is an explanatory diagram showing an example of document type metadata.
FIG. 9 is a flowchart illustrating an example of metadata search processing.

Next, a mode for carrying out the present disclosure will be described using examples.

FIG. 1 is a configuration diagram schematically showing the configuration of a metadata management system 20 as an embodiment of the present disclosure. The metadata management system 20 of the embodiment is configured as a system for managing metadata and, as shown in the figure, includes an on-premises first main management system 32, a cloud-based second main management system 42, a database system 44, a full-text search system 46, a natural language processing system 48, and a user authentication system (not shown). Here, the first main management system 32 exists in an on-premises environment 30, such as an environment within a company's intranet, and is configured using a well-known computer. The on-premises environment 30 also contains a user terminal 50 and a metadata source 60. The second main management system 42, the database system 44, the full-text search system 46, the natural language processing system 48, and the user authentication system all exist in a cloud environment 40, and each is configured as a multi-tenant system using a cloud service provided by a cloud vendor. A metadata source 70 also exists in the cloud environment 40.

The first main management system 32 sends each general type metadata from the database 62 and the tool 64 of the metadata source 60 to the second main management system 42. Here, general type metadata is metadata defined by an arbitrary metadata schema. Specifically, the general type metadata includes first metadata about a data item holding a single value and second metadata about a set of data items. The first metadata includes a first common part, a first metadata source dependent part that is defined when the metadata is stored in the metadata sources 60 and 70, and a first user defined part that is defined by the user, automatically defined, or predefined. The second metadata includes a second common part, a second metadata source dependent part that is defined when the metadata is stored in the metadata sources 60 and 70, and a second user defined part that is defined by the user, automatically defined, or predefined. The general type metadata is configured as a tree structure, and has the second common part as a vertex; the second metadata source dependent part, the second user defined part, and a set of first common parts as child nodes of the second common part; and the first metadata source dependent part and the first user defined part as child nodes of each first common part.
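As a minimal sketch of the tree structure described above, the general type metadata might be modeled in Python as follows. The class names, field names, and the `walk` helper are hypothetical and do not appear in the disclosure; `None` is used here to stand for blank (unknown) information.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

# Each part is modeled as item-name -> value pairs; None means blank (unknown).
Part = Dict[str, Optional[str]]


@dataclass
class FirstMetadata:
    """Metadata about one data item that holds a single value."""
    common: Part                                          # first common part
    source_dependent: Part = field(default_factory=dict)  # first metadata source dependent part
    user_defined: Part = field(default_factory=dict)      # first user defined part


@dataclass
class SecondMetadata:
    """Metadata about a set of data items; its common part is the vertex of the tree."""
    common: Part                                          # second common part (vertex)
    source_dependent: Part = field(default_factory=dict)  # second metadata source dependent part
    user_defined: Part = field(default_factory=dict)      # second user defined part
    items: List[FirstMetadata] = field(default_factory=list)  # set of first common parts


def walk(meta: SecondMetadata) -> Iterator[Tuple[int, str]]:
    """Yield (depth, node label) pairs in the order of the tree described in the text."""
    yield 0, "second common part"
    yield 1, "second metadata source dependent part"
    yield 1, "second user defined part"
    yield 1, "set of first common parts"
    for _ in meta.items:
        yield 2, "first common part"
        yield 3, "first metadata source dependent part"
        yield 3, "first user defined part"


# Hypothetical instance with a single data item.
table = SecondMetadata(
    common={"table name": "STUDENT_TABLE"},
    source_dependent={"file name": "student.dat"},
    items=[FirstMetadata(common={"column name": "ID"},
                         source_dependent={"data type": "integer"})],
)
nodes = list(walk(table))
```

Keeping the three parts separate at each level mirrors the distinction the text draws between information common to all sources, information that depends on the metadata source, and information supplied by or for the user.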

The metadata source 60 includes a database 62 and a tool 64. Note that in FIG. 1 the metadata source 60 includes one database 62 and one tool 64, but it may include only one of them, or it may include one or more of each. An example of the tool 64 is an ETL (Extract, Transform, Load) tool. The database 62 and the tool 64 store, for example, files (data and the general type metadata attached to the data). The metadata source 60 (the database 62 and the tool 64) is exposed to the first main management system 32 through an API (Application Programming Interface).

The second main management system 42 stores each general type metadata from the first main management system 32 and the metadata source 70 in the database 44a of the database system 44. The second main management system 42 is exposed to the first main management system 32 and the user terminal 50 via API.

The metadata source 70 includes a database 72 and a tool 74. Note that in FIG. 1 the metadata source 70 includes one database 72 and one tool 74, but it may include only one of them, or it may include one or more of each. An example of the tool 74 is an ETL tool. The database 72 and the tool 74 store, for example, files (data and the general type metadata attached to the data). The metadata source 70 (the database 72 and the tool 74) is exposed to the second main management system 42 via an API.

The database system 44 creates each document type metadata from each general type metadata, and performs various processes in cooperation with the full-text search system 46 and the natural language processing system 48. Document type metadata is configured as an outline structure. The document type metadata has, as a chapter, a second common part related part that is related to the second common part of the second metadata of the general type metadata. As sections, it has a second metadata source dependent part related part that is related to the second metadata source dependent part of the second metadata, a second user defined part related part that is related to the second user defined part of the second metadata, and a first common part set related part that is related to the set of first common parts of the first metadata. As a term, it has a first common part related part that is related to the first common part. As sub-terms, it has a first metadata source dependent part related part that is related to the first metadata source dependent part of the first metadata, and a first user defined part related part that is related to the first user defined part of the first metadata. The database system 44 has a database 44a. For example, each general type metadata is stored in the database 44a. The database system 44 is exposed to the second main management system 42 via an API.

The full-text search system 46 breaks down each document type metadata into document components to create an index, and performs a full-text search using the index of each document type metadata and the search strings from the user terminal 50. The full-text search system 46 has a database 46a. The database 46a stores, for example, each document type metadata, the index of each document type metadata, and the calculation results of natural language processing by the natural language processing system 48. Each document type metadata, its index, and its natural language processing calculation results are associated with one another using keys, and are also associated, using keys, with the general type metadata in the database 44a of the database system 44. The full-text search system 46 is exposed to the second main management system 42, the database system 44, and the natural language processing system 48 via an API.

The natural language processing system 48 performs natural language processing on each document type metadata. Here, examples of natural language processing include document vector calculation processing, complementation processing, proofreading processing, classification processing, search processing for similar document type metadata, and the like. Further, natural language processing can be performed using deep learning, for example, techniques such as RNN (Recurrent Neural Network) and BERT (Bidirectional Encoder Representations from Transformers). The natural language processing system 48 has a database 48a. The natural language processing system 48 is exposed to the second main management system 42 and the database system 44 via an API.

The user terminal 50 is configured as a desktop computer, a notebook computer, a smartphone, a tablet terminal, etc., and includes a processing section, a storage section, an input section, a display section, and the like. The processing unit is configured as a well-known computer. The storage unit is configured as a hard disk, SSD, or the like. Examples of the input unit include a keyboard, a mouse, and a touch panel. The user terminal 50 is capable of communicating with the first main management system 32 and the second main management system 42.

Next, the operation of the metadata management system 20 of the embodiment will be explained. The following describes, in order, the user authentication processing performed when a user is authenticated, the metadata acquisition related processing performed when general type metadata is acquired from the metadata sources 60 and 70, and the metadata search processing performed when a user searches for desired metadata.

First, the user authentication process when a user performs authentication will be explained. This process is executed when the user operates the input section of the user terminal 50 and instructs to display an authentication screen for user authentication. Note that when this instruction is given, the user terminal 50 transmits the instruction to a user authentication system (not shown).

In the user authentication process, first, the user authentication system transmits the data of the authentication screen to the user terminal 50, and causes the authentication screen to be displayed on the display section of the user terminal 50. Here, the authentication screen is configured so that the user can input the user's ID and password and issue an authentication instruction. When the user operates the input unit 76 to input the user's ID and password and issue an authentication instruction on the authentication screen, and this information is transmitted from the user terminal 50 to the user authentication system, the user authentication system authenticates the user using the user's ID and password. When the user authentication is complete, the user authentication system transmits the main screen data to the user terminal 50 and causes the display section of the user terminal 50 to display the main screen. Here, the main screen is configured so that instructions to execute the metadata acquisition related processing, the metadata search processing, and the like can be given.

Next, a description will be given of the metadata acquisition related processing performed when general type metadata is acquired from the metadata sources 60 and 70. FIG. 2 is a flowchart illustrating an example of the metadata acquisition related processing. This processing is executed by the metadata management system 20 when the user operates the input section of the user terminal 50 to set the databases 62, 72 and tools 64, 74 to be acquired in the metadata sources 60, 70 and issues an acquisition instruction. The user terminal 50 transmits the information on the databases 62, 72 and tools 64, 74 to be acquired, together with the acquisition instruction, to the first and second main management systems 32, 42. Note that, once the databases 62, 72 and tools 64, 74 to be acquired are set, this processing may be executed periodically (for example, every few days or every few weeks).

In the metadata acquisition related processing in FIG. 2, first, the first main management system 32 acquires each general type metadata from the database 62 and the tool 64 to be acquired in the metadata source 60 using their APIs, and transmits it to the second main management system 42 (step S100). The second main management system 42 acquires each general type metadata from the database 72 and the tool 74 to be acquired in the metadata source 70 using their APIs, merges each new general type metadata from the databases 62, 72 and tools 64, 74 with each existing general type metadata in the database 44a of the database system 44, and stores the result in the database 44a using the API of the database system 44 (step S110).

Here, each new general type metadata from the databases 62, 72 and tools 64, 74 to be acquired may not contain all the information needed by the user. When a user uses data, it is preferable that the name, characteristics, configuration rules, and the like of the data be easy for the user to understand. In addition, amendments to the legal system may require certain types of data to be managed. Based on these considerations, in the embodiment, before merging each new general type metadata with each existing general type metadata, the second main management system 42 adds to each new general type metadata, as an additional item, at least one of a first item set by the user, a second item proposed to and approved by the user, and a predetermined third item. For example, when the user operates the input section of the user terminal 50 to set an item to be added to the general type metadata (a first item) and the item is sent from the user terminal 50 to the second main management system 42, the second main management system 42 adds that item to each new general type metadata as an additional item. In addition, if there is an item to be added to the general type metadata that the user has set more than a predetermined number of times in the past and that is not currently set (a second item), the second main management system 42 proposes the item to the user by sending it to the user terminal 50 and displaying it on the display section of the user terminal 50, and when the user approves, adds the item to each new general type metadata as an additional item. Note that the information of an additional item is left blank (unknown).
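The handling of additional items described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary representation of a user defined part, and the way past settings are counted are hypothetical, and `None` stands for blank (unknown) information.

```python
from typing import Dict, Iterable, List, Optional


def add_additional_items(
    user_defined: Dict[str, Optional[str]],
    first_items: Iterable[str] = (),            # items set by the user
    approved_second_items: Iterable[str] = (),  # items proposed to and approved by the user
    third_items: Iterable[str] = (),            # predetermined items
) -> Dict[str, Optional[str]]:
    """Return a copy of a user defined part with the additional items added as blank."""
    result = dict(user_defined)
    for name in (*first_items, *approved_second_items, *third_items):
        result.setdefault(name, None)  # information that already exists is kept
    return result


def propose_second_items(
    past_counts: Dict[str, int],
    current_items: Iterable[str],
    predetermined_number: int,
) -> List[str]:
    """Items set more than a predetermined number of times in the past and not currently set."""
    current = set(current_items)
    return sorted(
        name for name, count in past_counts.items()
        if count > predetermined_number and name not in current
    )


# Hypothetical example using the additional items named in the embodiment.
extended = add_additional_items(
    {}, ["logical name"], ["description"], ["personal information"]
)
candidates = propose_second_items(
    {"description": 5, "owner": 1, "logical name": 7}, ["logical name"], 3
)
```

Adding the items as blank entries, rather than omitting them, keeps a place in the metadata that later processing or the user can fill in.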

As an example for explaining the processing of steps S100 and S110, consider a case where data as shown in FIG. 3 and general type metadata as shown in FIG. 4 are stored in the databases 62, 72 and tools 64, 74 to be acquired. Although the general type metadata is configured as a tree structure as described above, it is illustrated as a table in FIG. 4 for ease of understanding. When managing data (files), general type metadata is needed to register and reference the necessary information, so general type metadata like that in FIG. 4 is attached to data like that in FIG. 3.

In the data in FIG. 3, the "table name" is "STUDENT_TABLE", the "file name" is "student.dat", and the columns "ID", "Name", "Kana", "Birthday", "Height", and "Weight" are included. In the general type metadata in FIG. 4, the "table name" is "STUDENT_TABLE", the "file name" is "student.dat", and the corresponding "data type" and "number of digits" are included for each "column name", namely "ID", "Name", "Kana", "Birthday", "Height", and "Weight". For example, for the "column name" "ID", the "data type" is "integer" and the "number of digits" is "18", and for the "column name" "Name", the "data type" is "character string" and the "number of digits" is "16".

Then, in the process of step S110, at least one of the first item, the second item, and the third item is added to each new general type metadata as an additional item. If the additional items are a logical name, a description, and the presence or absence of personal information, general type metadata as shown in FIG. 4 becomes general type metadata as shown in FIG. 5. A black square in FIG. 5 means that the information is blank (unknown). As described above, the general type metadata is configured as a tree structure, and has the second common part as a vertex; the second metadata source dependent part, the second user defined part, and the set of first common parts as child nodes of the second common part; and the first metadata source dependent part and the first user defined part as child nodes of each first common part. Applied to the general type metadata in FIG. 5, the pair of "table name" and "STUDENT_TABLE" corresponds to the second common part. The pair of "file name" and "student.dat" corresponds to the second metadata source dependent part. The pair of "logical name" and blank information, the pair of "description" and blank information, and the pair of "personal information" and blank information correspond to the second user defined part. "Column" corresponds to the set of first common parts, and the pair of "column name" and "ID", the pair of "column name" and "Name", and so on correspond to the first common parts. Under the pair of "column name" and "ID", the pair of "data type" and "integer" and the pair of "number of digits" and "18", and under the pair of "column name" and "Name", the pair of "data type" and "character string" and the pair of "number of digits" and "16", and so on correspond to the first metadata source dependent part. Under the pair of "column name" and "ID", the pair of "column name" and "Name", and so on, the pair of "logical name" and blank information, the pair of "description" and blank information, and the pair of "personal information" and blank information correspond to the first user defined part.

In the embodiment, in parallel with the processing in steps S100 and S110, the second main management system 42 communicates with the database 62 and the tool 64 to be acquired via the first main management system 32, and communicates with the database 72 and the tool 74 to be acquired. When, through this communication, it detects among the existing general type metadata in the database 44a general type metadata that does not exist in the databases 62, 72 or tools 64, 74 to be acquired, it deletes that general type metadata and also deletes, from the database 46a of the full-text search system 46, the document type metadata corresponding to the deleted general type metadata, the index of that document type metadata, and the calculation results of natural language processing. General type metadata may exist in the database 44a but not in the acquisition target not only when the general type metadata in the acquisition target has been deleted, but also when its name has been changed or when multiple general type metadata in the acquisition target have been integrated. Further, instead of automatically and completely deleting the general type metadata and the like, only a deletion mark may be added, and complete deletion may be performed based on the user's instructions.
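The deletion-mark variant described above can be sketched as follows. The function names, the `"deleted"` flag, and the use of plain dictionaries keyed by a shared key for the databases 44a and 46a are assumptions made for illustration, not the disclosed implementation.

```python
from typing import Dict, Iterable, List, Set


def mark_missing(existing: Dict[str, dict], present_keys: Set[str]) -> Set[str]:
    """Add a deletion mark to stored general type metadata that is absent from the sources."""
    missing = set(existing) - present_keys
    for key in missing:
        existing[key]["deleted"] = True
    return missing


def purge(existing: Dict[str, dict], derived_stores: List[dict], keys: Iterable[str]) -> None:
    """Completely delete marked metadata and the data derived from it (on user instruction)."""
    for key in keys:
        if existing.get(key, {}).get("deleted"):
            del existing[key]
            for store in derived_stores:
                store.pop(key, None)  # document type metadata, index, NLP results


# Hypothetical stores that share the same keys.
general = {"k1": {"table name": "A"}, "k2": {"table name": "B"}}
documents = {"k1": "table name: A", "k2": "table name: B"}
indexes = {"k1": {"a"}, "k2": {"b"}}
nlp_results = {"k2": [0.1, 0.2]}

missing = mark_missing(general, present_keys={"k1"})
purge(general, [documents, indexes, nlp_results], missing)
```

Separating the marking step from the purge step leaves room for the user to confirm, for example, whether a missing entry was really deleted or merely renamed.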

When the process of step S110 is completed in this way, the second main management system 42 transmits to the user terminal 50 a message to the effect that each new general type metadata has been stored in the database 44a, and causes the message to be displayed on the display section of the user terminal 50. In this way, the user is notified that each new general type metadata has been stored in the database 44a.

Subsequently, the database system 44 creates each document type metadata from each general type metadata in the database 44a and, using the API of the full-text search system 46, stores each document type metadata in the database 46a of the full-text search system 46 (step S120). This process is performed on new general type metadata. Each document type metadata in the database 46a is associated with the general type metadata in the database 44a using a key. As described above, the new general type metadata includes the additional items with their information left blank, so the document type metadata created by the process of step S120 also includes the additional items with their information left blank.

In the case of general type metadata as shown in FIG. 5, the process of step S120 creates document type metadata as shown in FIG. 6. As with the black squares in FIG. 5, the black squares in FIG. 6 mean that the information is blank (unknown). As described above, the document type metadata is configured as an outline structure, and has the second common part related part as a chapter; the second metadata source dependent part related part, the second user defined part related part, and the first common part set related part as sections; the first common part related part as a term; and the first metadata source dependent part related part and the first user defined part related part as sub-terms. Applied to the document type metadata in FIG. 6, "table name: STUDENT_TABLE" corresponds to the second common part related part. "File name: student.dat" corresponds to the second metadata source dependent part related part. "Logical name: [blank]", "description: [blank]", and "personal information: [blank]" correspond to the second user defined part related part. "Column" corresponds to the first common part set related part. "Column name: ID", "Column name: Name", and so on correspond to the first common part related parts. "Data type: integer" and "Number of digits: 18" under "Column name: ID", and "Data type: character string" and "Number of digits: 16" under "Column name: Name", correspond to the first metadata source dependent part related part. "Logical name: [blank]", "Description: [blank]", and "Personal information: [blank]" under "Column name: ID", "Column name: Name", and so on correspond to the first user defined part related part.
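The creation of outline-structured document type metadata in step S120 can be illustrated with the following Python sketch. The nested-dictionary input format, the function names, the two-space indentation per outline level, and the "[blank]" marker are all assumptions made for illustration; FIG. 6 itself is not reproduced here.

```python
from typing import List, Optional


def fmt(name: str, value: Optional[str]) -> str:
    """Render one item; None (blank information) is shown as '[blank]'."""
    return f"{name}: {value if value is not None else '[blank]'}"


def to_document(meta: dict) -> str:
    """Create an indented outline from tree-structured general type metadata."""
    lines: List[str] = []
    # Chapter: second common part related part.
    lines += [fmt(k, v) for k, v in meta["common"].items()]
    # Sections: second metadata source dependent and user defined part related parts.
    for part in ("source_dependent", "user_defined"):
        lines += ["  " + fmt(k, v) for k, v in meta[part].items()]
    # Section: first common part set related part.
    lines.append("  " + meta["items_label"])
    for item in meta["items"]:
        # Next level: first common part related part.
        lines += ["    " + fmt(k, v) for k, v in item["common"].items()]
        # Lowest level: first metadata source dependent and user defined part related parts.
        for part in ("source_dependent", "user_defined"):
            lines += ["      " + fmt(k, v) for k, v in item[part].items()]
    return "\n".join(lines)


# Part of the STUDENT_TABLE example from the text, with one column only.
example = {
    "common": {"table name": "STUDENT_TABLE"},
    "source_dependent": {"file name": "student.dat"},
    "user_defined": {"logical name": None},
    "items_label": "column",
    "items": [{
        "common": {"column name": "ID"},
        "source_dependent": {"data type": "integer", "number of digits": "18"},
        "user_defined": {"logical name": None},
    }],
}
document = to_document(example)
```

Because the result is plain text, it can be handed to full-text indexing and natural language processing without knowledge of the original metadata schema, which is the point the embodiment makes about document type metadata.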

After the processing in step S120, the database system 44 cooperates with the full-text search system 46 using the API of the full-text search system 46, and causes the full-text search system 46 to create an index for each document type metadata (step S130). The full-text search system 46 breaks down each document type metadata in the database 46a into document constituent elements, creates an index, and stores the created index for each document type metadata in the database 46a (step S140). The process in step S140 is performed on new document type metadata. Each document type metadata index in the database 46a is associated with the general type metadata in the database 44a and the document type metadata in the database 46a using a key.
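As a rough illustration of steps S130 and S140, the following Python sketch builds a simple inverted index keyed by the same keys that associate the document type metadata with the general type metadata. The tokenizer (a lowercase word split), the function names, and the in-memory dictionary layout are assumptions; the disclosure does not state how the full-text search system 46 decomposes documents or stores its index.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Break a document into constituent elements (here: lowercase word tokens)."""
    return re.findall(r"\w+", text.lower())


def build_index(documents: dict) -> dict:
    """Map each token to {document key: occurrence count}.

    `documents` maps a key (shared with the general type metadata)
    to the text of the corresponding document type metadata.
    """
    index = defaultdict(dict)
    for key, text in documents.items():
        for token in tokenize(text):
            index[token][key] = index[token].get(key, 0) + 1
    return dict(index)


documents = {
    "k1": "Table name: STUDENT_TABLE\n  Column name: ID",
    "k2": "Table name: COURSE_TABLE",
}
index = build_index(documents)
print(index["table"])
```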

After the processing in step S140, the database system 44 uses the API of the natural language processing system 48 to cooperate with the natural language processing system 48, and causes it to perform natural language processing on each document type metadata (step S150). The natural language processing system 48 uses the API of the full-text search system 46 to acquire each document type metadata in the database 46a of the full-text search system 46, performs natural language processing on each document type metadata, stores the calculation results in the database 46a, and proposes them to the user (step S160). The process in step S160 is performed on new document type metadata, similar to the process in step S140. The calculation results of natural language processing for each document type metadata in the database 46a are each associated, using a key, with the general type metadata in the database 44a and with the document type metadata and the document type metadata index in the database 46a.

As described above, examples of natural language processing include document vector calculation processing, complementation processing, proofreading processing, classification processing, search processing for similar document type metadata, and the like. Further, natural language processing can be performed using deep learning techniques such as RNN and BERT. In the embodiment, since the natural language processing system 48 is configured as a multi-tenant system, natural language processing can be performed with higher accuracy than when configured as a single tenant system. In the embodiment, natural language processing includes at least document vector calculation processing, similar document type metadata search processing, and complementation processing. Since the document vector calculation process does not form the core of the present invention, a detailed explanation will be omitted.

The search process for similar document type metadata will be explained. For example, this search process calculates, for the target document type metadata, the cosine distance between its document vector and the document vector of each other document type metadata, and searches for similar document type metadata using the calculated cosine distances. When the full-text search system 46 implements calculation of the cosine distance between document vectors, the natural language processing system 48 may use the API of the full-text search system 46 to cooperate with the full-text search system 46 and cause the full-text search system 46 to perform the search process for similar document type metadata.
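The similarity search described above can be sketched in a few lines of Python. Here cosine distance is taken as one minus cosine similarity, which is a common convention but an assumption with respect to this disclosure; the function names and the toy two-dimensional vectors are likewise hypothetical, and real document vectors would come from the document vector calculation process.

```python
import math


def cosine_distance(u: list, v: list) -> float:
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def most_similar(target_key: str, vectors: dict, top_n: int = 3) -> list:
    """Rank all other documents by ascending cosine distance to the target."""
    target = vectors[target_key]
    scored = [
        (cosine_distance(target, vec), key)
        for key, vec in vectors.items()
        if key != target_key
    ]
    scored.sort()
    return [key for _, key in scored[:top_n]]


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(most_similar("a", vectors, top_n=2))
```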

The complementation process will be explained. In the embodiment, the document type metadata created by the process of step S120 includes additional items whose information is left blank. The complementation process in this case is performed as follows, for example. The natural language processing system 48 uses deep learning technology (for example, RNN, BERT, etc.) to solve a fill-in-the-blank problem for the document type metadata in which the additional item information is blank. When similar document type metadata has been found by the search process for similar document type metadata, the fill-in-the-blank problem may be solved taking the similar document type metadata into account. Then, the natural language processing system 48 stores the fill-in candidates obtained as the calculation result in the database 46a of the full-text search system 46 and transmits them to the second main management system 42. The second main management system 42 acquires the corresponding general type metadata from the database 44a of the database system 44, fills the blanks in the acquired general type metadata with the fill-in candidates, transmits the complemented general type metadata to the user terminal 50, and has it displayed on the display unit, thereby proposing the fill-in candidates to the user. Note that instead of the complemented general type metadata, only the fill-in candidates may be proposed to the user.

When the complementation process is performed on document type metadata as shown in FIG. 6, that is, document type metadata in which the logical name, the description, and the presence or absence of personal information are blank, general type metadata as shown in FIG. 7, in which the blanks in the general type metadata of FIG. 5 are filled with the fill-in candidates, is proposed to the user. Note that instead of this, only the fill-in candidates for the logical name, the description, and the presence or absence of personal information may be proposed to the user.

Then, when the user approves the fill-in candidates, a notification to that effect is sent to the second main management system 42, and the second main management system 42 updates the database 44a of the database system 44 by overwriting the general type metadata whose information is blank (for example, see FIG. 5) with the general type metadata whose blanks are filled with the fill-in candidates (for example, see FIG. 7) (step S170).
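The propose-then-approve flow around step S170 can be sketched as follows. This is only an illustration under simplifying assumptions: the general type metadata is flattened to a single dictionary, blank information is represented by `None`, and the names `apply_candidates` and `review` are hypothetical. How the fill-in candidates themselves are computed (for example, by a deep learning model) is outside this sketch.

```python
import copy


def apply_candidates(general: dict, candidates: dict) -> dict:
    """Return a copy of the metadata with only its blank (None) fields filled."""
    filled = copy.deepcopy(general)
    for name, value in candidates.items():
        if name in filled and filled[name] is None:
            filled[name] = value
    return filled


def review(store: dict, key: str, candidates: dict, approved: bool) -> dict:
    """Build the proposal shown to the user; overwrite the stored record only on approval."""
    proposal = apply_candidates(store[key], candidates)
    if approved:
        store[key] = proposal  # corresponds to the overwrite in step S170
    return proposal


store = {"k1": {"Table name": "STUDENT_TABLE", "Logical name": None}}
candidates = {"Logical name": "Student table", "Table name": "IGNORED"}

proposal = review(store, "k1", candidates, approved=False)
unchanged = copy.deepcopy(store)
review(store, "k1", candidates, approved=True)
print(store["k1"])
```

Keeping the proposal separate from the stored record until approval mirrors the description, in which the database 44a is updated only after the user approves the candidates.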

When the general type metadata in the database 44a is updated in this way, the processes of steps S180 to S220 are performed on the updated general type metadata in the same way as the processes of steps S120 to S160 described above, and the metadata acquisition related process of FIG. 2 ends. Specifically, the details are as follows. Similar to the process in step S120, the database system 44 creates document type metadata from the updated general type metadata in the database 44a, and updates the document type metadata in the database 46a of the full-text search system 46 by overwriting the pre-update document type metadata with the created document type metadata (step S180). In the case of general type metadata as shown in FIG. 7, document type metadata as shown in FIG. 8 is created by the process of step S180. Subsequently, similar to the processing in steps S130 and S140, in accordance with instructions from the database system 44, the full-text search system 46 creates an index for the updated document type metadata in the database 46a, and updates the index of the document type metadata by overwriting the pre-update index in the database 46a with the created index (steps S190, S200). Then, similar to the processing in steps S150 and S160, in accordance with instructions from the database system 44, the natural language processing system 48 performs natural language processing (for example, document vector calculation processing) on the updated document type metadata in the database 46a, and updates the calculation result of the natural language processing by overwriting the pre-update calculation result in the database 46a (steps S210, S220).

When the metadata acquisition-related process of FIG. 2 is thus completed, the database system 44 transmits a notification that the metadata acquisition-related process has been completed to the user terminal 50 via the second main management system 42, and causes the display unit to display it.

Next, a description will be given of metadata search processing when a user searches for desired metadata. FIG. 9 is a flowchart illustrating an example of metadata search processing. This process is executed by the systems of the metadata management system 20 when the user operates the input section of the user terminal 50, specifies one or more search strings, and instructs execution of this process. Note that when an instruction to execute this process is given, the user terminal 50 transmits the one or more search strings and a search instruction to the second main management system 42.

In the metadata search process of FIG. 9, the second main management system 42 transmits the search string and the search instruction to the full text search system 46 (step S300). The full-text search system 46 performs a full-text search using the search string, the index of each document type metadata in the database 46a, and the calculation results of natural language processing (for example, the calculation results of document vectors, the search results of similar document type metadata, etc.), and transmits the search results to the second main management system 42 (step S310). One example of a full-text search method is to score each document type metadata using the search string and the index of each document type metadata or the calculation results of natural language processing, and to arrange the results in descending order of score. In this way, document type metadata containing content that is the same as or similar to (related to) the search string can be searched relatively easily and in a short time.
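The score-and-rank method mentioned above can be sketched as follows. The scoring rule used here (summing the occurrence counts of each search token per document) is an assumption chosen for brevity, not the scoring used by the full-text search system 46, and the index layout (token to per-document counts) and the function name `search` are hypothetical. Scores derived from natural language processing results are omitted.

```python
from collections import Counter


def search(search_strings: list, index: dict) -> list:
    """Score documents by summed token counts and return (key, score) in descending order."""
    scores = Counter()
    for string in search_strings:
        for token in string.lower().split():
            for key, count in index.get(token, {}).items():
                scores[key] += count
    # Ties are broken by key so that the ordering is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy index: token -> {document key: occurrence count}
index = {
    "student": {"k1": 2, "k2": 1},
    "name": {"k1": 1},
    "course": {"k2": 3},
}
print(search(["student name"], index))
```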

The second main management system 42 transmits, to the user terminal 50, the search results from the full text search system 46 and general type metadata and/or document type metadata related to the search results in the database 44a of the database system 44 (step S320), and the metadata search process ends. The user terminal 50 displays this information on the display section. One example of a display method is to list, in the manner of search engine results, links labeled with part of the general type metadata and/or document type metadata (e.g., table names, file names, etc.) in order of the above-mentioned scores, and to display all items of the general type metadata and/or document type metadata when a link is selected.

In the embodiment, as described above, each document type metadata is created from each general type metadata, and an index of each document type metadata is also created. Therefore, by performing a full-text search using the search string, the index of each document type metadata, and the calculation results of natural language processing, the desired metadata (general type metadata and document type metadata) can be easily obtained even if the metadata schema of each general type metadata is not unified. For example, even if the number of metadata is sufficiently large, the desired metadata can be acquired in a short time. Furthermore, by performing natural language processing on each document type metadata, it is possible to complement the information of each document type metadata and to assist the user in searching for similar document type metadata. For example, it is difficult to perform natural language processing on general type metadata, and it would take a lot of effort for humans to complete the information manually; by creating document type metadata, however, the information can easily be complemented by natural language processing.

In the metadata management system 20 of the embodiment described above, each general type metadata is acquired from the databases 62, 72 and the tools 64, 74 that are the acquisition targets of the metadata sources 60, 70, and each document type metadata is created from each acquired general type metadata. An index of each document type metadata is then created, natural language processing is performed on each document type metadata, and, according to the user's instructions, a full-text search is performed on each document type metadata using the search strings, the index of each document type metadata, and the calculation results of natural language processing. As a result, the desired metadata can be easily obtained. Furthermore, by performing natural language processing on each document type metadata, it is possible to complement the information of each document type metadata and to assist the user in searching for similar document type metadata.

In the metadata management system 20 of the embodiment, the database 62 and tools 64 of the metadata source 60 are exposed to the first main management system 32 via API, and the first main management system 32 uses the API of the database 62 and tools 64 to be acquired to obtain each general type metadata from them. However, at least some of the database 62 and tools 64 need not be exposed to the first main management system 32 via API. If the database 62 or tool 64 to be acquired is not exposed to the first main management system 32 via API, it is conceivable that the first main management system 32 sends an acquisition request to the database 62 or tool 64 to be acquired, and the database 62 or tool 64 that receives the request sends each general type metadata to the first main management system 32.

In the metadata management system 20 of the embodiment, the database 72 and tools 74 of the metadata source 70 are exposed to the second main management system 42 via API, and the second main management system 42 uses the API of the database 72 and tools 74 to be acquired to obtain each general type metadata from them. However, at least some of the database 72 and tools 74 need not be exposed to the second main management system 42 via API. If the database 72 or tool 74 to be acquired is not exposed to the second main management system 42 via API, it is conceivable that the second main management system 42 sends an acquisition request to the database 72 or tool 74 to be acquired, and the database 72 or tool 74 that receives the request sends each general type metadata to the second main management system 42.

In the metadata management system 20 of the embodiment, each tree-structured general type metadata is acquired from the target databases 62, 72 and tools 64, 74, and each outline-structured document type metadata is created from each acquired general type metadata. However, the structure of the general type metadata to be acquired may be any graph structure and is not limited to a tree structure. If the structure of the general type metadata is a graph structure other than a tree structure, for example one in which at least part is a circular structure or a net structure, the document type metadata needs to have not a pure outline structure but, for example, an outline structure that includes references by links.
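One conceivable way to produce an outline that includes references by links from a non-tree graph is a depth-first traversal that expands each node once and emits a reference on every later visit. The following Python sketch illustrates that idea only; the adjacency-list input, the `(see: ...)` notation for a link, and the function name `to_outline` are assumptions and are not taken from the disclosure.

```python
def to_outline(graph: dict, root: str) -> list:
    """Render a possibly cyclic graph as outline lines, using references for revisits."""
    lines = []
    visited = set()

    def visit(node: str, depth: int) -> None:
        indent = "  " * depth
        if node in visited:
            # The node was already expanded elsewhere: emit a link instead of recursing.
            lines.append(f"{indent}(see: {node})")
            return
        visited.add(node)
        lines.append(f"{indent}{node}")
        for child in graph.get(node, []):
            visit(child, depth + 1)

    visit(root, 0)
    return lines


# A small graph containing both a cycle (C -> A) and a shared node (C).
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
print("\n".join(to_outline(graph, "A")))
```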

In the metadata management system 20 of the embodiment, the processing in steps S160 to S220 of the metadata acquisition related processing in FIG. 2, specifically, natural language processing on the document type metadata (step S160), updating the general type metadata (step S170), updating the document type metadata in the database 46a (step S180), and updating the index of the document type metadata in the database 46a and the calculation results of natural language processing (steps S190 to S220), is performed automatically. However, these processes may be performed in response to instructions from the user.

In the metadata management system 20 of the embodiment, in the process of step S110 of the metadata acquisition related process in FIG. 2, the second main management system 42 adds at least one of the first item, second item, and third item to each new general type metadata and stores it in the database 44a. However, the second main management system 42 may store each new general type metadata as is in the database 44a. In this case, the processes after step S170 may be performed automatically or may be performed in response to instructions from the user.
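The addition of blank additional items before storage can be sketched as below. The item names are borrowed from the user defined parts mentioned for FIG. 5 and FIG. 6, while the flat dictionary representation, the use of `None` for blank information, and the function name `add_blank_items` are assumptions made for illustration.

```python
ADDITIONAL_ITEMS = ("Logical name", "Description", "Personal information")


def add_blank_items(metadata: dict, items: tuple = ADDITIONAL_ITEMS) -> dict:
    """Return a copy of the metadata with each additional item present; new items are blank."""
    extended = dict(metadata)
    for item in items:
        # Existing values are kept; only missing items are added as blank (None).
        extended.setdefault(item, None)
    return extended


acquired = {"Table name": "STUDENT_TABLE", "Description": "Students"}
stored = add_blank_items(acquired)
print(stored)
```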

In the metadata management system 20 of the embodiment, in the process of step S310 of the metadata search process in FIG. 9, the full text search system 46 performs a full text search using the search string, the index of each document type metadata, and the calculation results of natural language processing. However, the full text search system 46 may perform a full text search without using the calculation results of natural language processing.

The metadata management system 20 of the embodiment includes a database system 44, a second main management system 42, a full text search system 46, and a natural language processing system 48. However, at least two of these may be configured as an integrated system.

The metadata management system 20 of the embodiment includes a first main management system 32 in an on-premises environment, and a database system 44, a second main management system 42, a full text search system 46, and a natural language processing system 48 in a cloud environment. However, in addition to the first main management system 32, the on-premises environment may include a database system 44, a second main management system 42, a full text search system 46, and a natural language processing system 48. In this case, the first main management system 32 and the second main management system 42 may be configured integrally.

The correspondence between the main elements of the embodiment and the main elements of the invention described in the section on means for solving the problem will be explained. In the embodiment, the first main management system 32 and the second main management system 42 correspond to a "general type metadata acquisition unit", and the database system 44 corresponds to a "document type metadata creation unit". Further, the full text search system 46 corresponds to a "full text search unit". The natural language processing system 48 corresponds to a "natural language processing unit".

The correspondence between the main elements of the embodiment and the main elements of the invention described in the section on means for solving the problem does not limit the elements of the invention described in that section, because the embodiment is merely an example for specifically explaining a mode for carrying out the invention described there. In other words, the invention described in the section on means for solving the problem should be interpreted based on the description in that section, and the embodiment is merely one specific example of that invention.

Although the embodiments of the present disclosure have been described above, the present disclosure is not limited to these embodiments in any way, and can of course be implemented in various forms without departing from the gist of the present disclosure.

The present disclosure can be used in the field of managing metadata.

Claims

A metadata management system that manages metadata,
a general type metadata acquisition unit that acquires each general type metadata, each of which is metadata defined by an arbitrary metadata schema;
a document type metadata creation unit that creates each document type metadata from each of the general type metadata;
A metadata management system with.
The metadata management system according to claim 1,
The document type metadata creation unit creates the document type metadata from the general type metadata having a graph structure.
Metadata management system.
3. The metadata management system according to claim 2,
The document type metadata creation unit creates the document type metadata having an outline structure from the general type metadata having a tree structure among the graph structures.
Metadata management system.
4. The metadata management system according to claim 3,
The general type metadata has first metadata about a data item holding a single value and second metadata about a set of the data items,
The first metadata has a first common part, a first metadata source dependent part, and a first user defined part,
The second metadata has a second common part, a second metadata source dependent part, and a second user defined part,
The general type metadata is
having the second common part as a vertex;
a set of the second metadata source dependent part, the second user definition part, and the first common part as a child node of the second common part;
having the first metadata source dependent part and the first user defined part as child nodes of the first common part;
The document type metadata has,
as a chapter, a second common part related part related to the second common part;
as sections, a second metadata source dependent part related part related to the second metadata source dependent part, a second user defined part related part related to the second user defined part, and a first common part set related part related to the set of the first common part;
as an item, a first common part related part related to the first common part; and
as sub-items, a first metadata source dependent part related part related to the first metadata source dependent part, and a first user defined part related part related to the first user defined part.
Metadata management system.
A metadata management system according to any one of claims 1 to 4, comprising:
The general type metadata acquisition unit acquires each of the general type metadata and adds to it at least one of a first item set by the user, a second item proposed and approved by the user, and a predetermined third item.
Metadata management system.
A metadata management system according to any one of claims 1 to 5, comprising:
a full-text search unit that, when a user sets one or more search strings and issues a search instruction, performs a full-text search on each of the document type metadata using the search strings;
A metadata management system further comprising:
A metadata management system according to any one of claims 1 to 6, comprising:
further comprising a natural language processing unit that performs natural language processing on the document type metadata,
At least the natural language processing unit is configured as a multi-tenant system in a cloud environment.
Metadata management system.
8. The metadata management system according to claim 7,
The natural language processing unit proposes the calculation result to the user,
When the calculation result is approved by the user, the general type metadata acquisition unit updates the general type metadata to reflect the calculation result.
Metadata management system.
A metadata management method for managing metadata, the method comprising:
(a) obtaining each general type of metadata, each of which is metadata specified by an arbitrary metadata schema;
(b) creating each document type metadata from each of the general type metadata;
A metadata management method having.
A program for making a computer function as a metadata management system for managing metadata,
(a) obtaining each general type of metadata, each of which is metadata specified by an arbitrary metadata schema;
(b) creating each document type metadata from each of the general type metadata;
A program with