CN117574961B - Parameter-efficient method and device for injecting adapters into a pre-trained model - Google Patents
- Publication number
- CN117574961B (application CN202410051188.6A)
- Authority
- CN
- China
- Prior art keywords
- training
- model
- adapter
- module
- training model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Machine Translation (AREA)
Abstract
Description
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410051188.6A (published as CN117574961B) | 2024-01-15 | 2024-01-15 | Parameter-efficient method and device for injecting adapters into a pre-trained model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410051188.6A (published as CN117574961B) | 2024-01-15 | 2024-01-15 | Parameter-efficient method and device for injecting adapters into a pre-trained model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117574961A (zh) | 2024-02-20 |
CN117574961B (zh) | 2024-03-22 |
Family
ID=89892124
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202410051188.6A (published as CN117574961B) | Active | 2024-01-15 | 2024-01-15 | Parameter-efficient method and device for injecting adapters into a pre-trained model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117574961B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118301006A (zh) * | 2024-04-18 | 2024-07-05 | Tsinghua University | Data center network traffic model training method supporting rapid scenario adaptation |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
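CN107437096A (zh) * | 2017-07-28 | 2017-12-05 | Peking University | Image classification method based on a parameter-efficient deep residual network model |
CN110334689A (zh) * | 2019-07-16 | 2019-10-15 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Video classification method and device |
CN111160488A (zh) * | 2020-01-02 | 2020-05-15 | Civil Aviation University of China | CondenseNet algorithm fused with an attention selection mechanism |
CN113358346A (zh) * | 2021-06-07 | 2021-09-07 | Shenyang Ligong University | Gas valve fault diagnosis method based on wavelet packet decomposition and BP neural network |
WO2022126797A1 (zh) * | 2020-12-17 | 2022-06-23 | Zhejiang Lab | Automatic compression method and platform for pre-trained language models based on multi-level knowledge distillation |
CN116186171A (zh) * | 2022-12-19 | 2023-05-30 | PLA Strategic Support Force Information Engineering University | Continual relation extraction method and system based on a multi-head self-attention adapter |
CN116644316A (zh) * | 2023-05-31 | 2023-08-25 | Hangzhou Dianzi University | Lightweight adaptation network learning method for multimodal multi-task learning |
CN117077667A (zh) * | 2023-08-10 | 2023-11-17 | Zhejiang University | Adapter-based language model knowledge injection method and system |
CN117233960A (zh) * | 2023-11-15 | 2023-12-15 | Tsinghua University | Online design method and device for optical systems based on intelligent optical computing |
CN117290429A (zh) * | 2023-11-24 | 2023-12-26 | Shandong Jiaoyiwang Digital Technology Co., Ltd. (山东焦易网数字科技股份有限公司) | Method for calling data system interfaces through natural language |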
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP4163829A1 (en) * | 2021-10-05 | 2023-04-12 | Universität Zürich | Parameter-efficient method for training neural networks |
US20230325725A1 (en) * | 2022-04-12 | 2023-10-12 | Google Llc | Parameter Efficient Prompt Tuning for Efficient Models at Scale |
- 2024-01-15: CN application CN202410051188.6A (patent CN117574961B, zh); status: Active
Non-Patent Citations (2)
Title |
---|
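Parameter-efficient fine-tuning of large-scale pre-trained language models; Ning Ding et al.; Nature Machine Intelligence; 2023-03-02; 220-235 * |
A review of research progress in question-answering technology based on large language models; Wen Sen et al.; Data Analysis and Knowledge Discovery; 2023-11-13; 1-14 * |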
Also Published As
Publication number | Publication date |
---|---|
CN117574961A (zh) | 2024-02-20 |
Similar Documents
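Publication | Publication Date | Title |
---|---|---|
CN109767759B (zh) | | Method for building a CLDNN structure applied to end-to-end speech recognition |
CN106126507B (zh) | | Deep neural translation method and system based on character encoding |
WO2022126797A1 (zh) | | Automatic compression method and platform for pre-trained language models based on multi-level knowledge distillation |
Xiao et al. | | History-based attention in Seq2Seq model for multi-label text classification |
CN115064155B (zh) | | End-to-end speech recognition incremental learning method and system based on knowledge distillation |
US20240070394A1 (en) | | Systems and methods for ensembling soft prompts in few-shot fine-tuning of language models |
CN110737764A (zh) | | Personalized dialogue content generation method |
Lam et al. | | Gaussian process lstm recurrent neural network language models for speech recognition |
CN117574961B (zh) | | Parameter-efficient method and device for injecting adapters into a pre-trained model |
CN108415888A (zh) | | Compression method and system for neural network language models |
US11941356B2 (en) | | Systems and methods for multi-scale pre-training with densely connected transformer |
EP3847584A1 (en) | | System and method for synthesis of compact and accurate neural networks (scann) |
CN118885558A (zh) | | Pre-trained language model fine-tuning method based on a lightweight feed-forward network adapter |
CN117151173A (zh) | | Model compression method and system based on meta-learning |
CN116994098B (zh) | | Large-model prompt learning method enhanced with category attribute knowledge |
Ghorbani et al. | | Domain expansion in DNN-based acoustic models for robust speech recognition |
CN112612881A (zh) | | Transformer-based Chinese intelligent dialogue method |
CN116168401A (zh) | | Training method for a text image translation model based on a multimodal codebook |
CN116306808A (zh) | | Convolutional neural network compression method and device combining dynamic pruning and conditional convolution |
CN115982586A (zh) | | Semi-supervised continual learning method for few-shot text-to-SQL task streams |
Wan et al. | | Improved dynamic memory network for dialogue act classification with adversarial training |
CN115422369A (zh) | | Knowledge graph completion method and device based on improved TextRank |
CN114218953A (zh) | | Medical text named entity recognition method |
CN117312491A (zh) | | Machine reading comprehension attention method, system, medium, device and terminal |
CN116434742A (zh) | | Low-resource speech keyword detection method based on unsupervised learning and transfer learning |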
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 2024-12-31. Patentee after: Chengdu Lingshu Yijian Health Technology Co.,Ltd., No. 18 Guanyinqiao West Road, Jinjiang District, Chengdu City, Sichuan Province 610000 (self assigned number 58), China. Patentee before: CHENGDU University OF INFORMATION TECHNOLOGY, No.24, Section 1, Xuefu Road, Southwest Airport Economic Development Zone, Chengdu, Sichuan 610200, China. |
 | TR01 | Transfer of patent right | Effective date of registration: 2025-05-09. Patentee after: Shenzhen Tiancheng Xinneng Cloud Technology Co.,Ltd., 17G, Fortune Building, No. 88 Fuhua 3rd Road, Gangxia Community, Futian Street, Shenzhen City, Guangdong Province 518000, China. Patentee before: Chengdu Lingshu Yijian Health Technology Co.,Ltd., No. 18 Guanyinqiao West Road, Jinjiang District, Chengdu City, Sichuan Province 610000 (self assigned number 58), China. |