WO2025141700A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2025141700A1 (application number PCT/JP2023/046681)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- name
- target
- matching target
- inference
- description
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Definitions
- the present invention relates to an information processing device, an information processing method, and a program.
- A technology is known that determines name matches using a machine learning model fine-tuned from a pre-trained language model (for example, Non-Patent Document 1).
- the present invention was made in consideration of these points, and aims to improve the accuracy of the name matching task while maintaining versatility.
- a trained model that, when inference data is input in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, outputs information indicating whether or not the first matching target in the input inference data matches the second matching target in the inference data, and (1) outputs information indicating whether or not the first matching target and the second matching target match, based on a first learning dataset in which a name of the first matching target, a name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether or not the first matching target and the second matching target match are associated
- the system includes a memory unit that stores the trained model that has learned a name matching task that, when a description of a first name matching target and a description of
- the description of the first matching target associated in the inference data and the first learning dataset may include information indicating details of the first matching target, which is a product, and the price of the first matching target
- the description of the second matching target associated in the inference data and the first learning dataset may include information indicating details of the second matching target, which is a product, and the price of the second matching target
- the description of the estimated target associated in the second learning dataset may include information indicating details of the estimated target.
- the trained model may be obtained by training a general-purpose language model to perform the name matching task based on the first training data set and the title inference task based on the second training data set.
- the acquisition unit may acquire the first training data set and the second training data set
- the information processing device may further include a learning unit that generates the trained model by training on (1) the name matching task based on the first training data set acquired by the acquisition unit and (2) the title inference task based on the second training data set, and stores the trained model in the storage unit.
- the learning unit may train the name matching task based on the first training data set and the title inference task based on the second training data set in parallel in a single training process.
- a computer executes an inference dataset, the acquisition unit acquiring inference data in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, and a trained model stored in a storage unit, the trained model outputting information indicating whether or not the first matching target in the inference data matches the second matching target in the inference data when the inference data is input, the trained model including: (1) a first learning dataset in which a name of the first matching target, a name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether or not the first matching target matches the second matching target are associated;
- the method includes: (1) a name matching task that outputs information indicating whether the first name matching target and the second name matching target match when a name of a first name matching target, a name of a second name matching target, a description of the first name
- the method includes a step of inputting the inference data into the trained model that has learned the above, determining whether the first name matching target in the inference data matches the second name matching target in the inference data, and a step of outputting the result determined in the determining step.
- a computer is provided with an acquisition unit that acquires inference data, the inference data being a dataset in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, and a trained model stored in a storage unit, the trained model outputting information indicating whether or not the first matching target in the inference data matches the second matching target in the inference data when the inference data is input.
- the present invention makes it possible to improve the accuracy of the name matching task while maintaining versatility.
- FIG. 1 is a diagram for explaining an overview of an information processing system S.
- FIG. 2 is a diagram illustrating an example of a learning dataset.
- FIG. 3 is a block diagram showing a configuration of an information processing device 1.
- FIG. 11 is a flowchart for explaining a learning flow in a learning unit 134.
- 11 is a diagram showing an example of a prompt acquired by an acquisition unit 131.
- FIG. 4 is a flowchart showing a process flow in the information processing device 1.
- Fig. 1 is a diagram for explaining an overview of an information processing system S.
- Fig. 1(a) shows the configuration of the information processing system S.
- the information processing system S is a system for performing name matching.
- Name matching is a task executed by a machine learning model, and is a task of determining whether or not multiple given targets match.
- the subject of name matching performed by the information processing system S is, for example, the name of a product or service, but is not limited to this.
- the information processing system S may also perform name matching on corporate names, personal names, or other names.
- the information processing system S has an information processing device 1 and an information terminal 2.
- the information processing device 1 and the information terminal 2 are connected so as to be able to communicate with each other via a network.
- the information processing device 1 is a device for performing name matching.
- One example of the information processing device 1 is a server.
- the information processing device 1 trains a machine learning model, and when data to be matched is given, it uses the machine learning model to determine whether or not the targets in the given data match.
- the information terminal 2 is a terminal used by a user of the information processing system S.
- the information terminal 2 transmits a data set to be used for learning or inference to the information processing device 1, instructs the information processing device 1 to execute learning or inference, receives the inference results from the information processing device 1, and displays them on the display unit.
- the information processing device 1 and the information terminal 2 may be configured as one unit.
- In that case, the information processing device 1 has an input/output interface, accepts operations from the user, and displays the inference results.
- the information processing device 1 stores a pre-trained model M1.
- the pre-trained model M1 is a general-purpose language model that has been trained on a large dataset so that it can execute natural language processing tasks.
- the information processing device 1 trains the pre-trained model M1 to perform a name matching task and a title inference task, and generates a trained model M2.
- the name matching task is a task in which the names of multiple matching targets and text describing each of the targets are given, and the task determines whether the multiple matching targets match.
- the name of the matching target indicates the name of the product, natural person, legal entity, etc., of the matching target.
- the description of the matching target indicates the nature of the matching target. For example, if the matching target is a product, the description of the matching target includes the product's size, color, function, place of manufacture, manufacturer, seller, model number, operating environment, raw materials, selling points, price, etc.
- If the subject of the name matching is a natural person, the description of the subject will include information such as date of birth, place of origin, alma mater, occupation, and achievements. If the subject of the name matching is a corporation, the description will include information such as the corporation's address, number of employees, year of establishment, history, composition of officers, and sales.
- the information processing device 1 trains the pre-trained model M1 to perform a name matching task based on a first training dataset.
- An example of the first training dataset is shown in FIG. 2(a).
- In the first training dataset, the name of the first matching target, the name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether the first matching target and the second matching target match are associated with one another.
- the title inference task is a task in which text describing an object whose title is to be inferred is given, and the title of the object described by the given text is generated.
- the information processing device 1 trains the pre-trained model M1 to perform the title inference task based on the second training dataset.
- An example of the second training dataset is shown in FIG. 2(b).
- In the second training dataset, the name of the object to be inferred is associated with a description of that object.
- By learning the title inference task, the model M2 can learn named entities used in the description that may affect the title. This enables the model to recognize important expressions in the text that affect the results of name matching. As a result, the accuracy of the name matching task is expected to improve without compromising the versatility of the model.
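As a minimal sketch of the two training datasets described above, the following Python records hold one entry of each dataset and turn it into a prompt/target text pair that a language model could be fine-tuned on. The class names, field names, and prompt wording are assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchingExample:
    """One record of the first training dataset (name matching task)."""
    name_1: str
    name_2: str
    description_1: str
    description_2: str
    is_match: bool  # label: do the two matching targets match?


@dataclass(frozen=True)
class TitleExample:
    """One record of the second training dataset (title inference task)."""
    name: str         # the title the model should generate
    description: str  # text describing the target


def matching_to_text_pair(ex: MatchingExample) -> tuple[str, str]:
    """Build a (prompt, target) pair; the prompt layout is hypothetical."""
    prompt = (
        f"Name 1: {ex.name_1}\n"
        f"Description 1: {ex.description_1}\n"
        f"Name 2: {ex.name_2}\n"
        f"Description 2: {ex.description_2}\n"
        "Do these refer to the same target? Answer yes or no:"
    )
    return prompt, "yes" if ex.is_match else "no"


def title_to_text_pair(ex: TitleExample) -> tuple[str, str]:
    """Build a (prompt, target) pair; the prompt layout is hypothetical."""
    prompt = f"Description: {ex.description}\nTitle:"
    return prompt, ex.name
```

Keeping both tasks in the same text-in, text-out shape is one way to let a single general-purpose language model learn them together; other encodings are equally possible.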
- the trained model M2 is trained to receive inference data D1 and output a determination result D2 corresponding to the input inference data.
- In the inference data D1, the name of the first matching target, the name of the second matching target, a description of the first matching target, and a description of the second matching target are associated with one another.
- the determination result D2 indicates whether the first matching target in the inference data matches the second matching target in the inference data.
- the information processing device 1 inputs the inference data D1 into the trained model M2 and outputs the determination result D2.
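The path from inference data D1 to determination result D2 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `InferenceData`, `build_prompt`, `determine`, and the prompt text are hypothetical names, and `toy_model` is a trivial placeholder standing in for the trained model M2.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class InferenceData:
    """Inference data D1: two names and two descriptions, with no label."""
    name_1: str
    name_2: str
    description_1: str
    description_2: str


def build_prompt(d1: InferenceData) -> str:
    # Hypothetical prompt layout; the source does not give the real one.
    return (
        f"Name 1: {d1.name_1}\n"
        f"Description 1: {d1.description_1}\n"
        f"Name 2: {d1.name_2}\n"
        f"Description 2: {d1.description_2}\n"
        "Do these refer to the same target? Answer yes or no:"
    )


def determine(model: Callable[[str], str], d1: InferenceData) -> bool:
    """Return determination result D2: True if the model says the targets match."""
    answer = model(build_prompt(d1)).strip().lower()
    return answer.startswith("yes")


def toy_model(prompt: str) -> str:
    """Placeholder standing in for trained model M2 (NOT a language model).

    It answers "yes" only when the two names are equal ignoring case.
    """
    names = [
        line.split(": ", 1)[1]
        for line in prompt.splitlines()
        if line.startswith("Name ")
    ]
    return "yes" if names[0].lower() == names[1].lower() else "no"
```

In a real system, `model` would wrap a call to the fine-tuned language model; passing it in as a callable keeps the prompt construction and the yes/no parsing testable on their own.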
- FIG. 3 is a block diagram showing the configuration of the information processing device 1.
- the information processing device 1 has a communication unit 11, a storage unit 12, and a control unit 13.
- the control unit 13 has an acquisition unit 131, a determination unit 132, an output unit 133, and a learning unit 134.
- the communication unit 11 is a communication interface for sending and receiving data with other devices via a network.
- the storage unit 12 is a storage medium including a ROM (Read Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), a hard disk drive, etc.
- the storage unit 12 pre-stores programs to be executed by the control unit 13.
- the storage unit 12 stores a pre-trained model M1 and a trained model M2.
- the control unit 13 is a processor such as a CPU (Central Processing Unit).
- the control unit 13 executes the programs stored in the storage unit 12, thereby functioning as an acquisition unit 131, a determination unit 132, an output unit 133, and a learning unit 134.
- the acquisition unit 131 acquires the inference data D1. As an example, the acquisition unit 131 acquires the inference data D1 from the information terminal 2. The acquisition unit 131 may acquire the inference data D1 from the storage unit 12, or from an external device (not shown). The acquisition unit 131 may acquire a first learning data set and a second learning data set, and output them to the learning unit 134.
- the name matching may be performed based on a data set that includes the price of the product.
- the learning unit 134 causes the pre-trained model M1 to execute a name matching task based on the first training data set and output the results (S04).
- the learning unit 134 causes the pre-trained model M1 to execute a title inference task based on the second training data set and output the results (S05).
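Steps S04 and S05, together with the option of training both tasks in a single training process, could be organized as in the following sketch. It assumes that examples from the two datasets are already (prompt, target) text pairs and simply mixes them into one shuffled stream of batches. How the source actually combines the two tasks is not stated in this excerpt, so the mixing strategy, the function names, and the `train_step` callback are all assumptions.

```python
import random
from typing import Callable, Iterator, Sequence

# (task name, prompt, target); the task tag is kept only for bookkeeping.
Example = tuple[str, str, str]


def mixed_batches(
    matching: Sequence[tuple[str, str]],
    titles: Sequence[tuple[str, str]],
    batch_size: int,
    seed: int = 0,
) -> Iterator[list[Example]]:
    """Yield batches drawn from both datasets in one shuffled stream."""
    pool: list[Example] = [("name_matching", p, t) for p, t in matching]
    pool += [("title_inference", p, t) for p, t in titles]
    random.Random(seed).shuffle(pool)  # seeded so runs are repeatable
    for start in range(0, len(pool), batch_size):
        yield pool[start:start + batch_size]


def train_one_epoch(
    train_step: Callable[[list[Example]], None],
    matching: Sequence[tuple[str, str]],
    titles: Sequence[tuple[str, str]],
    batch_size: int = 2,
    seed: int = 0,
) -> int:
    """Run one pass over both tasks; returns the number of steps taken.

    `train_step` is a hypothetical callback that would run the model on the
    batch, compare its outputs with the targets, and update the weights.
    """
    steps = 0
    for batch in mixed_batches(matching, titles, batch_size, seed):
        train_step(batch)
        steps += 1
    return steps
```

Because both tasks share one loop, a single optimizer and a single set of weights see name matching and title inference examples side by side, which is one plausible reading of training the tasks in parallel.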
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
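| JP2023579770A JP7454156B1 (ja) | 2023-12-26 | 2023-12-26 | Information processing device, information processing method, and program |
| PCT/JP2023/046681 WO2025141700A1 (ja) | 2023-12-26 | 2023-12-26 | Information processing device, information processing method, and program |
| JP2024028010A JP2025102599A (ja) | 2023-12-26 | 2024-02-28 | Information processing device, information processing method, and program |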
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2023/046681 WO2025141700A1 (ja) | 2023-12-26 | 2023-12-26 | Information processing device, information processing method, and program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025141700A1 (ja) | 2025-07-03 |
Family
ID=90273385
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/046681 Pending WO2025141700A1 (ja) | Information processing device, information processing method, and program | 2023-12-26 | 2023-12-26 |
Country Status (2)
| Country | Link |
|---|---|
| JP (2) | JP7454156B1 |
| WO (1) | WO2025141700A1 |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
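| JP2019185244A (ja) * | 2018-04-05 | 2019-10-24 | 富士通株式会社 | Learning program and learning method |
| WO2023132029A1 (ja) * | 2022-01-06 | 2023-07-13 | 日本電気株式会社 | Information processing device, information processing method, and program |
| WO2023162206A1 (ja) * | 2022-02-28 | 2023-08-31 | 日本電気株式会社 | Information processing device, information processing method, and information processing program |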
-
2023
- 2023-12-26 WO PCT/JP2023/046681 patent/WO2025141700A1/ja active Pending
- 2023-12-26 JP JP2023579770A patent/JP7454156B1/ja active Active
-
2024
- 2024-02-28 JP JP2024028010A patent/JP2025102599A/ja active Pending
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
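| JP2019185244A (ja) * | 2018-04-05 | 2019-10-24 | 富士通株式会社 | Learning program and learning method |
| WO2023132029A1 (ja) * | 2022-01-06 | 2023-07-13 | 日本電気株式会社 | Information processing device, information processing method, and program |
| WO2023162206A1 (ja) * | 2022-02-28 | 2023-08-31 | 日本電気株式会社 | Information processing device, information processing method, and information processing program |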
Non-Patent Citations (1)
| Title |
|---|
| RALPH PEETERS; CHRISTIAN BIZER: "Entity Matching using Large Language Models", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 17 October 2023 (2023-10-17), XP091638078 * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2025102599A (ja) | 2025-07-08 |
| JP7454156B1 (ja) | 2024-03-22 |
| JPWO2025141700A1 | 2025-07-03 |
Similar Documents
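| Publication | Publication Date | Title |
|---|---|---|
| US20250005627A1 (en) | | Configurable relevance service platform incorporating a relevance test driver |
| JP2022537250A5 | | |
| CN112424748B (zh) | | Source code file recommendation notification |
| CN109523342A (zh) | | Service policy generation method and apparatus, electronic device, and storage medium |
| JP7471760B1 (ja) | | Information processing method, information processing system, and program |
| CN110033285A (zh) | | Housing listing information publishing method, apparatus, device, and computer-readable storage medium |
| CN111080399A (zh) | | Commodity information processing method and apparatus |
| US10922219B2 (en) | | A/B test apparatus, method, program, and system |
| US20230012316A1 (en) | | Automation of leave request process |
| WO2025141700A1 (ja) | | Information processing device, information processing method, and program |
| CN115202916A (zh) | | Test data acquisition method, system, electronic device, and readable storage medium |
| US20250390518A1 (en) | | Evaluating context-specific content generated by a generative artificial intelligence model |
| CN111199287A (zh) | | Real-time feature engineering recommendation method, apparatus, and electronic device |
| TWI848294B (zh) | | Iterative training of computer models for machine learning |
| CN108549722B (zh) | | Multi-platform data publishing method, system, and medium |
| US20240386009A1 (en) | | Systems, apparatuses, methods, and computer program products for backfilling of real-time data |
| US20180121978A1 (en) | | User-Assisted Processing of Receipts and Invoices |
| JP2022064865A (ja) | | Computer-implemented method, computer program, and computer system (extraction of structured information from unstructured documents) |
| CN107180037B (zh) | | Human-computer interaction method and apparatus |
| CN115994267B (zh) | | Real-time user profiling method, apparatus, computer device, and storage medium |
| US12293168B2 (en) | | Generating digital assistants from source code repositories |
| US12306849B2 (en) | | Systems and methods for dynamically generating new data rules |
| CN115545823B (zh) | | Product recommendation method, apparatus, computer device, and storage medium |
| CN116975440B (zh) | | Attribute classification method, apparatus, and computer device |
| CN117971399B (zh) | | Container count adjustment method, intelligent computing cloud operating system, and computing platform |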
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2023579770; Country of ref document: JP; Kind code of ref document: A |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23963054; Country of ref document: EP; Kind code of ref document: A1 |