JPWO2023064033A5 - Google Patents

Info

Publication number
JPWO2023064033A5
Authority
JP
Japan
Prior art keywords
layers
machine learning
learning model
layer
parameter values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2024522110A
Other languages
English (en)
Japanese (ja)
Other versions
JP2024539003A (ja)
JP2024539003A5
Publication date
Priority claimed from US17/735,651 (US12512091B2)
Application filed
Publication of JP2024539003A
Publication of JPWO2023064033A5
Publication of JP2024539003A5
Legal status: Pending (current)

Links

JP2024522110A 2021-10-12 2022-08-17 Fine-tuning multi-head network from a single transformer layer of pre-trained language model Pending JP2024539003A (ja)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202163254740P 2021-10-12 2021-10-12
US63/254,740 2021-10-12
US17/735,651 2022-05-03
US17/735,651 US12512091B2 (en) 2021-10-12 2022-05-03 Fine-tuning multi-head network from a single transformer layer of pre-trained language model
PCT/US2022/040530 WO2023064033A1 (en) 2021-10-12 2022-08-17 Fine-tuning multi-head network from a single transformer layer of pre-trained language model

Publications (3)

Publication Number Publication Date
JP2024539003A 2024-10-28
JPWO2023064033A5 2025-08-04
JP2024539003A5 2025-08-04

Family

ID=85798249

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2024522110A Pending JP2024539003A (ja) 2021-10-12 2022-08-17 Fine-tuning multi-head network from a single transformer layer of pre-trained language model

Country Status (6)

Country Link
US (2) US12512091B2
JP (1) JP2024539003A
KR (1) KR20240089615A
CN (1) CN118140230A
GB (1) GB2631139A
WO (1) WO2023064033A1

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12548552B2 (en) * 2021-11-19 2026-02-10 International Business Machines Corporation Dynamic language selection of an AI voice assistance system
US11947935B2 (en) * 2021-11-24 2024-04-02 Microsoft Technology Licensing, Llc. Custom models for source code generation via prefix-tuning
US20240061835A1 (en) * 2022-08-22 2024-02-22 Oracle International Corporation System and method of selective fine-tuning for custom training of a natural language to logical form model
US20240169165A1 (en) * 2022-11-17 2024-05-23 Samsung Electronics Co., Ltd. Automatically Generating Annotated Ground-Truth Corpus for Training NLU Model
US12562163B2 (en) * 2023-05-12 2026-02-24 Servicenow, Inc. Bidirectional assistant for development platforms
CN116774140A (zh) * 2023-06-26 2023-09-19 南京邮电大学 Gridless signal source DOA estimation method based on a residual attention network
US20250005282A1 (en) * 2023-06-29 2025-01-02 Amazon Technologies, Inc. Domain entity extraction for performing text analysis tasks
CN118446218B (zh) * 2024-05-16 2024-11-01 西南交通大学 An adversarial reading-comprehension method for nested named entity recognition
CA3253531A1 (en) * 2024-06-14 2026-01-19 The Toronto-Dominion Bank Context retrieval for in-context learning model
WO2026000314A1 (en) * 2024-06-27 2026-01-02 Beijing Youzhuju Network Technology Co., Ltd. Model-based task processing
JP7658644B1 (ja) * 2024-10-21 2025-04-08 スパーブエーアイ カンパニー リミテッド Method for training custom model based on pre-trained base model and learning device using the same
CN119418321B (zh) * 2024-10-30 2025-09-30 上海哔哩哔哩科技有限公司 Model training method, method for detecting and recognizing text, and related apparatus
CN119418319B (zh) * 2024-10-30 2025-09-30 上海哔哩哔哩科技有限公司 Model training method, text detection method, apparatus, medium, and program product
CN119418320B (zh) * 2024-10-30 2025-09-30 上海哔哩哔哩科技有限公司 A model training method, apparatus, medium, and program product
CN119915374B (zh) * 2025-04-03 2025-11-14 浙江潮汐力科技有限公司 Fault monitoring method, apparatus, device, storage medium, and program product

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11138392B2 (en) * 2018-07-26 2021-10-05 Google Llc Machine translation using neural network models
US20200042864A1 (en) 2018-08-02 2020-02-06 Veritone, Inc. Neural network orchestration
US11556778B2 (en) 2018-12-07 2023-01-17 Microsoft Technology Licensing, Llc Automated generation of machine learning models
US20210279596A1 (en) 2020-03-06 2021-09-09 Hitachi, Ltd. System for predictive maintenance using trace norm generative adversarial networks
US20220094713A1 (en) * 2020-09-21 2022-03-24 Sophos Limited Malicious message detection
US12141701B2 (en) * 2021-01-21 2024-11-12 International Business Machines Corporation Channel scaling: a scale-and-select approach for selective transfer learning
US11875898B2 (en) * 2021-05-26 2024-01-16 Merative Us L.P. Automatic condition diagnosis using an attention-guided framework
US20230106669A1 (en) * 2021-09-27 2023-04-06 X Development Llc Binding affinity prediction using neural networks

Similar Documents

Publication Publication Date Title
JPWO2023064033A5
GB2631139A (en) Fine-tuning multi-head network from a single transformer layer of pre-trained language model
Jankov et al. Declarative recursive computation on an rdbms, or, why you should use a database for distributed machine learning
US11921711B2 (en) Trained sequence-to-sequence conversion of database queries
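JP6939384B2 (ja) Data processing device, method, and program
CN109710737B (zh) An intelligent reasoning method based on structured queries
CN119272872B (zh) Method for executing a batch of requests by applying a machine learning model, and non-transitory computer-readable storage medium
JP7100422B2 (ja) Apparatus, program, and method for data property recognition
CN113836174B (zh) Asynchronous SQL join query optimization method based on the reinforcement learning DQN algorithm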
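JP2005502934A5
DE102017109239A1 (de) Computer-implemented method, computer-readable medium, and heterogeneous computing system
JPWO2021050170A5
CN117909458A (zh) Method for constructing a professional question-answering system for the mold domain based on an LLM
DE112016002370T5 (de) Local persisting of data for a selectively offline-capable voice action in a voice-enabled electronic device
CN114691891B (zh) A knowledge-graph-oriented question-answering reasoning method
CN118093847B (zh) Response information generation method, system, apparatus, device, medium, and program product
CN119227817A (zh) A mechanical manufacturing knowledge question-answering method using large-model multi-agent collaboration
JPWO2022159461A5
CN109033084B (zh) A semantic hierarchy tree construction method and apparatus
CN118503383A (zh) Chatbot-based method and apparatus for optimizing and accelerating question-answering model inference
JP2020057386A5
CN104133891A (zh) A storage method for massive structured data based on a relational database
CN107315843A (zh) Storage method and system for massive structured data
CN116502683A (zh) A full-process parallel-accelerated brain simulation method and system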
US20240086156A1 (en) Hybrid code combining imperative programming languages with declarative database operations to accomplish iterative logic