WO2025141700A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2025141700A1
Authority
WO
WIPO (PCT)
Prior art keywords
name
target
matching target
inference
description
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2023/046681
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
将人 藤武
雄輝 奥村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fast Accounting Co Ltd
Original Assignee
Fast Accounting Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fast Accounting Co Ltd
Priority to JP2023579770A (patent JP7454156B1)
Priority to PCT/JP2023/046681 (patent WO2025141700A1)
Priority to JP2024028010A (patent JP2025102599A)
Publication of WO2025141700A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning

Definitions

  • the present invention relates to an information processing device, an information processing method, and a program.
  • A technology is known that uses a machine learning model fine-tuned from a pre-trained language model to perform name matching (for example, Non-Patent Document 1).
  • the present invention was made in consideration of these points, and aims to improve the accuracy of the name matching task while maintaining versatility.
  • a trained model that, when inference data is input in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, outputs information indicating whether or not the first matching target in the input inference data matches the second matching target in the inference data, the trained model having learned (1) a name matching task that, when the name of the first matching target, the name of the second matching target, the description of the first matching target, and the description of the second matching target are input, outputs information indicating whether or not the first matching target and the second matching target match, based on a first learning dataset in which a name of the first matching target, a name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether or not the first matching target and the second matching target match are associated
  • the system includes a memory unit that stores the trained model that has learned a name matching task that, when a description of a first name matching target and a description of
  • the description of the first matching target associated in the inference data and the first learning dataset may include information indicating details of the first matching target, which is a product, and the price of the first matching target
  • the description of the second matching target associated in the inference data and the first learning dataset may include information indicating details of the second matching target, which is a product, and the price of the second matching target
  • the description of the estimated target associated in the second learning dataset may include information indicating details of the estimated target.
  • the trained model may have trained a generic language model to perform the name matching task based on the first training data set and the title inference task based on the second training data set.
  • the acquisition unit may acquire the first training data set and the second training data set
  • the information processing device may further include a learning unit that generates the trained model trained on (1) the name matching task based on the first training data set acquired by the acquisition unit and (2) the title inference task based on the second training data set, and stores the trained model in the storage unit.
  • the learning unit may train the name matching task based on the first training data set and the title inference task based on the second training data set in parallel in a single training process.
  • a computer executes a step of acquiring inference data, the inference data being a dataset in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, and uses a trained model stored in a storage unit, the trained model outputting information indicating whether or not the first matching target in the inference data matches the second matching target in the inference data when the inference data is input, the trained model having learned: (1) based on a first learning dataset in which a name of the first matching target, a name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether or not the first matching target matches the second matching target are associated;
  • the method includes: (1) a name matching task that outputs information indicating whether the first name matching target and the second name matching target match when a name of a first name matching target, a name of a second name matching target, a description of the first name
  • the method includes a step of inputting the inference data into the trained model that has learned the above, determining whether the first name matching target in the inference data matches the second name matching target in the inference data, and a step of outputting the result determined in the determining step.
  • a computer is provided with an acquisition unit that acquires inference data, the inference data being a dataset in which a name of a first matching target, a name of a second matching target, a description of the first matching target, and a description of the second matching target are associated, and a trained model stored in a storage unit, the trained model outputting information indicating whether or not the first matching target in the inference data matches the second matching target in the inference data when the inference data is input
  • the present invention makes it possible to improve the accuracy of the name matching task while maintaining versatility.
  • FIG. 1 is a diagram for explaining an overview of an information processing system S.
  • FIG. 2 is a diagram illustrating an example of a learning dataset.
  • FIG. 3 is a block diagram showing a configuration of an information processing device 1.
  • FIG. 11 is a flowchart for explaining a learning flow in a learning unit 134.
  • 11 is a diagram showing an example of a prompt acquired by an acquisition unit 131.
  • FIG. 4 is a flowchart showing a process flow in the information processing device 1.
  • Fig. 1 is a diagram for explaining an overview of an information processing system S.
  • Fig. 1(a) shows the configuration of the information processing system S.
  • the information processing system S is a system for performing name matching.
  • Name matching is a task executed by a machine learning model, and is a task of determining whether or not multiple given targets match.
  • the subject of name matching performed by the information processing system S is, for example, the name of a product or service, but is not limited to this.
  • the information processing system S may also perform name matching on corporate names, personal names, or other names.
  • the information processing system S has an information processing device 1 and an information terminal 2.
  • the information processing device 1 and the information terminal 2 are connected so as to be able to communicate with each other via a network.
  • the information processing device 1 is a device for performing name matching.
  • One example of the information processing device 1 is a server.
  • the information processing device 1 trains a machine learning model, and when data to be matched is given, it uses the machine learning model to determine whether or not the targets in the given data match.
  • the information terminal 2 is a terminal used by a user of the information processing system S.
  • the information terminal 2 transmits a data set to be used for learning or inference to the information processing device 1, instructs the information processing device 1 to execute learning or inference, receives the inference results from the information processing device 1, and displays them on the display unit.
  • the information processing device 1 and the information terminal 2 may be configured as one unit.
  • In that case, the information processing device 1 has an input/output interface, accepts operations from the user, and displays the inference results.
  • the information processing device 1 stores a pre-trained model M1.
  • the pre-trained model M1 is a general-purpose language model that has been trained on a large dataset so that it can execute natural language processing tasks.
  • the information processing device 1 trains the pre-trained model M1 to perform a name matching task and a title inference task, and generates a trained model M2.
  • the name matching task is a task in which the names of multiple matching targets and text describing each of the targets are given, and the task determines whether the multiple matching targets match.
  • the name of the matching target indicates the name of the product, natural person, legal entity, etc., of the matching target.
  • the description of the matching target indicates the nature of the matching target. For example, if the matching target is a product, the description of the matching target includes the product's size, color, function, place of manufacture, manufacturer, seller, model number, operating environment, raw materials, selling points, price, etc.
  • if the matching target is a natural person, the description of the matching target includes information such as date of birth, place of origin, alma mater, occupation, and achievements. If the matching target is a corporation, the description of the matching target includes information such as the corporation's address, number of employees, year of establishment, history, composition of officers, and sales.
  • the information processing device 1 trains the pre-trained model M1 to perform a name matching task based on a first training dataset.
  • An example of the first training dataset is shown in FIG. 2(a).
  • in the first training dataset, the name of the first matching target, the name of the second matching target, a description of the first matching target, a description of the second matching target, and a label indicating whether the first matching target and the second matching target match are associated with each other.
  • the title inference task is a task in which text describing an object whose title is to be inferred is given, and the title of the object described by the given text is generated.
  • the information processing device 1 trains the pre-trained model M1 to perform the title inference task based on the second training dataset.
  • An example of the second training dataset is shown in FIG. 2(b).
  • in the second training dataset, the name of the object to be inferred is associated with a description of the object to be inferred.
  • by also learning the title inference task, the model M2 can learn the named entities used in a description that may affect the title. This enables the model to recognize the important expressions in the text that affect the results of name matching. As a result, the accuracy of the name matching task is expected to improve without compromising the versatility of the model.
  • the trained model M2 is trained to receive inference data D1 and output a determination result D2 corresponding to the input inference data.
  • in the inference data D1, the name of the first matching target, the name of the second matching target, a description of the first matching target, and a description of the second matching target are associated with each other.
  • the determination result D2 indicates whether the first matching target in the inference data matches the second matching target in the inference data.
  • the information processing device 1 inputs the inference data D1 into the trained model M2 and outputs the determination result D2.
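
The data flow described above (inference data D1 in, determination result D2 out) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the class name `MatchingRecord`, the prompt wording in `matching_prompt`, and the yes/no answer format are all assumptions, and `model` is a hypothetical stand-in for the trained model M2 represented as a plain text-in, text-out callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MatchingRecord:
    """Names and descriptions of two matching targets.

    With a label, this is one row of the first training dataset;
    without a label, it plays the role of inference data D1.
    """
    name_1: str
    name_2: str
    description_1: str
    description_2: str
    label: Optional[bool] = None


def matching_prompt(record: MatchingRecord) -> str:
    # Hypothetical prompt layout; the source does not specify the wording.
    return (
        "Do the following two items refer to the same thing? Answer yes or no.\n"
        f"Item 1 name: {record.name_1}\n"
        f"Item 1 description: {record.description_1}\n"
        f"Item 2 name: {record.name_2}\n"
        f"Item 2 description: {record.description_2}\n"
        "Answer:"
    )


def determine(model: Callable[[str], str], d1: MatchingRecord) -> bool:
    """Pass D1 to the model and parse its text answer into D2 (True = match)."""
    answer = model(matching_prompt(d1)).strip().lower()
    return answer.startswith("yes")
```

Any callable that maps a prompt string to an answer string can be passed as `model`, which keeps the sketch independent of a particular language-model library.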
  • FIG. 3 is a block diagram showing the configuration of the information processing device 1.
  • the information processing device 1 has a communication unit 11, a storage unit 12, and a control unit 13.
  • the control unit 13 has an acquisition unit 131, a determination unit 132, an output unit 133, and a learning unit 134.
  • the communication unit 11 is a communication interface for sending and receiving data with other devices via a network.
  • the storage unit 12 is a storage medium including a ROM (Read Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), a hard disk drive, etc.
  • the storage unit 12 pre-stores programs to be executed by the control unit 13.
  • the storage unit 12 stores a pre-trained model M1 and a trained model M2.
  • the control unit 13 is a processor such as a CPU (Central Processing Unit).
  • the control unit 13 executes the programs stored in the storage unit 12, thereby functioning as an acquisition unit 131, a determination unit 132, an output unit 133, and a learning unit 134.
  • the acquisition unit 131 acquires the inference data D1. As an example, the acquisition unit 131 acquires the inference data D1 from the information terminal 2. The acquisition unit 131 may acquire the inference data D1 from the storage unit 12, or from an external device (not shown). The acquisition unit 131 may acquire a first learning data set and a second learning data set, and output them to the learning unit 134.
  • the name matching may be performed based on a data set that includes the price of the product.
  • the learning unit 134 causes the pre-trained model M1 to execute a name matching task based on the first training data set and output the results (S04).
  • the learning unit 134 causes the pre-trained model M1 to execute a title inference task based on the second training data set and output the results (S05).
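
The text states that the name matching task and the title inference task may be trained in parallel in a single training process, but does not say how the two datasets are combined. One common approach, shown here purely as an assumption, is to convert both datasets into (prompt, target) pairs and shuffle them into one training stream. The function names, prompt wording, and tuple layouts below are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[str, str]  # (prompt text, target text)
MatchingRow = Tuple[str, str, str, str, bool]  # name 1, name 2, desc 1, desc 2, label
TitleRow = Tuple[str, str]  # name, description


def matching_examples(rows: Sequence[MatchingRow]) -> List[Example]:
    """Turn first-dataset rows into name matching examples (target: yes/no)."""
    examples = []
    for name_1, name_2, desc_1, desc_2, label in rows:
        prompt = (
            f"Item 1: {name_1} ({desc_1})\n"
            f"Item 2: {name_2} ({desc_2})\n"
            "Same item?"
        )
        examples.append((prompt, "yes" if label else "no"))
    return examples


def title_examples(rows: Sequence[TitleRow]) -> List[Example]:
    """Turn second-dataset rows into title inference examples (target: the name)."""
    return [(f"Description: {desc}\nName?", name) for name, desc in rows]


def mixed_training_stream(
    first: Sequence[MatchingRow], second: Sequence[TitleRow], seed: int = 0
) -> List[Example]:
    """Interleave both tasks so one training run sees examples of each."""
    examples = matching_examples(first) + title_examples(second)
    random.Random(seed).shuffle(examples)  # seeded for reproducibility
    return examples
```

The resulting list could then be fed, batch by batch, to whatever fine-tuning routine is used for the pre-trained model M1; that routine is outside the scope of this sketch.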

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2023/046681 2023-12-26 2023-12-26 Information processing device, information processing method, and program Pending WO2025141700A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
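JP2023579770A JP7454156B1 (ja) 2023-12-26 2023-12-26 Information processing device, information processing method, and program
PCT/JP2023/046681 WO2025141700A1 (ja) 2023-12-26 2023-12-26 Information processing device, information processing method, and program
JP2024028010A JP2025102599A (ja) 2023-12-26 2024-02-28 Information processing device, information processing method, and program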

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/046681 WO2025141700A1 (ja) 2023-12-26 2023-12-26 Information processing device, information processing method, and program

Publications (1)

Publication Number Publication Date
WO2025141700A1 (ja) 2025-07-03

Family

ID=90273385

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/046681 Pending WO2025141700A1 (ja) 2023-12-26 2023-12-26 Information processing device, information processing method, and program

Country Status (2)

Country Link
JP (2) JP7454156B1
WO (1) WO2025141700A1

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
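JP2019185244A (ja) * 2018-04-05 2019-10-24 Fujitsu Limited Learning program and learning method
WO2023132029A1 (ja) * 2022-01-06 2023-07-13 NEC Corporation Information processing device, information processing method, and program
WO2023162206A1 (ja) * 2022-02-28 2023-08-31 NEC Corporation Information processing device, information processing method, and information processing program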

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ralph Peeters; Christian Bizer: "Entity Matching using Large Language Models", arXiv.org, Cornell University Library, 17 October 2023 (2023-10-17), XP091638078 *

Also Published As

Publication number Publication date
JP2025102599A (ja) 2025-07-08
JP7454156B1 (ja) 2024-03-22
JPWO2025141700A1 2025-07-03

Similar Documents

Publication Publication Date Title
US20250005627A1 (en) Configurable relevance service platform incorporating a relevance test driver
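JP2022537250A5
CN112424748B (zh) Source code file recommendation notification
CN109523342A (zh) Service policy generation method and apparatus, electronic device, and storage medium
JP7471760B1 (ja) Information processing method, information processing system, and program
CN110033285A (zh) Property listing information publishing method, apparatus, device, and computer-readable storage medium
CN111080399A (zh) Commodity information processing method and apparatus
US10922219B2 (en) A/B test apparatus, method, program, and system
US20230012316A1 (en) Automation of leave request process
WO2025141700A1 (ja) Information processing device, information processing method, and program
CN115202916A (zh) Test data acquisition method, system, electronic device, and readable storage medium
US20250390518A1 (en) Evaluating context-specific content generated by a generative artificial intelligence model
CN111199287A (zh) Real-time feature engineering recommendation method, apparatus, and electronic device
TWI848294B (zh) Iterative training of computer models for machine learning
CN108549722B (zh) Multi-platform data publishing method, system, and medium
US20240386009A1 (en) Systems, apparatuses, methods, and computer program products for backfilling of real-time data
US20180121978A1 (en) User-Assisted Processing of Receipts and Invoices
JP2022064865A (ja) Computer-implemented method, computer program, and computer system (extraction of structured information from unstructured documents)
CN107180037B (zh) Human-computer interaction method and apparatus
CN115994267B (zh) Real-time user profiling method, apparatus, computer device, and storage medium
US12293168B2 (en) Generating digital assistants from source code repositories
US12306849B2 (en) Systems and methods for dynamically generating new data rules
CN115545823B (zh) Product recommendation method, apparatus, computer device, and storage medium
CN116975440B (zh) Attribute classification method, apparatus, and computer device
CN117971399B (zh) Container quantity adjustment method, intelligent computing cloud operating system, and computing platform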

Legal Events

Date Code Title Description
ENP Entry into the national phase. Ref document number: 2023579770; Country of ref document: JP; Kind code of ref document: A
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23963054; Country of ref document: EP; Kind code of ref document: A1