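JP2025522299A - Systems and methods for distributing layers of special mixture-of-experts machine learning models - Google Patents
Systems and methods for distributing layers of special mixture-of-experts machine learning models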
- Publication number
- JP2025522299A (application JP2024569371A)
- Authority
- JP
- Japan
- Prior art keywords
- accelerators
- layers
- sparse
- experts
- dense
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Neurology (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/848,679 US12579470B2 (en) | 2022-06-24 | 2022-06-24 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
| US17/848,679 | 2022-06-24 | | |
| PCT/US2023/022447 WO2023249754A1 (en) | 2022-06-24 | 2023-05-16 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2025522299A (ja) | 2025-07-15 |
| JP2025522299A5 | 2026-04-16 |
Family
ID=87036771
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2024569371A (Pending), published as JP2025522299A (ja) | Systems and methods for distributing layers of special mixture-of-experts machine learning models | 2022-06-24 | 2023-05-16 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US12579470B2 |
| EP (1) | EP4544451A1 |
| JP (1) | JP2025522299A |
| KR (1) | KR20250029051A |
| CN (1) | CN119452368A |
| WO (1) | WO2023249754A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117972293B (zh) * | 2024-03-28 | 2024-06-07 | 北京思凌科半导体技术有限公司 | Computation method, apparatus, device and storage medium based on a mixture-of-experts model |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018085643A1 (en) * | 2016-11-04 | 2018-05-11 | Google Llc | Mixture of experts neural networks |
| US10509846B2 (en) * | 2017-12-13 | 2019-12-17 | Intel Corporation | Accelerator for processing data |
| US11893502B2 (en) | 2017-12-20 | 2024-02-06 | Advanced Micro Devices, Inc. | Dynamic hardware selection for experts in mixture-of-experts model |
| EP3619654B1 (en) * | 2018-07-23 | 2024-09-04 | Google LLC | Continuous parametrizations of neural network layer weights |
| US11003823B2 (en) | 2018-08-09 | 2021-05-11 | Palo Alto Research Center Incorporated | Re-design of analog circuits |
| US20200117978A1 (en) | 2018-10-12 | 2020-04-16 | Alibaba Group Holding Limited | Systems and methods for efficiently mapping neural networks to programmable logic devices |
| US20220230051A1 (en) | 2018-11-18 | 2022-07-21 | Innatera Nanosystems B.V. | Spiking Neural Network |
| US12124941B2 (en) | 2020-03-27 | 2024-10-22 | Intel Corporation | Methods and apparatus for dynamic batching of data for neural network workloads |
| US11586894B2 (en) | 2020-05-04 | 2023-02-21 | SiMa Technologies, Inc. | Ordering computations of a machine learning network in a machine learning accelerator for efficient memory usage |
| US12530571B2 (en) | 2020-07-08 | 2026-01-20 | Nvidia Corporation | Image generation using one or more neural networks |
| US20220036186A1 (en) * | 2020-07-30 | 2022-02-03 | Waymo Llc | Accelerated deep reinforcement learning of agent control policies |
| US20220059200A1 (en) | 2020-08-21 | 2022-02-24 | Washington University | Deep-learning systems and methods for medical report generation and anomaly detection |
| US12518135B2 (en) * | 2021-02-05 | 2026-01-06 | Google Llc | Sparse and differentiable mixture of experts neural networks |
| US20230281510A1 (en) * | 2022-03-07 | 2023-09-07 | Qualcomm Incorporated | Machine learning model architecture combining mixture of experts and model ensembling |
2022
- 2022-06-24: US application US17/848,679 (US12579470B2), Active
2023
- 2023-05-16: CN application CN202380044796.8A (CN119452368A), Pending
- 2023-05-16: WO application PCT/US2023/022447 (WO2023249754A1), Ceased
- 2023-05-16: EP application EP23734783.6A (EP4544451A1), Pending
- 2023-05-16: JP application JP2024569371A (JP2025522299A), Pending
- 2023-05-16: KR application KR1020247042066A (KR20250029051A), Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN119452368A (zh) | 2025-02-14 |
| US12579470B2 (en) | 2026-03-17 |
| KR20250029051A (ko) | 2025-03-04 |
| EP4544451A1 (en) | 2025-04-30 |
| US20230419166A1 (en) | 2023-12-28 |
| WO2023249754A1 (en) | 2023-12-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110262901A (zh) | | Data processing method and data processing system |
| CN110689121A (zh) | | Method for splitting a neural network model using a multi-core processor, and related products |
| CN106055311B (zh) | | MapReduce task parallelization method based on pipelined multithreading |
| CN104778077B (zh) | | High-speed out-of-core graph processing method and system based on random and sequential disk access |
| US20240273346A1 (en) | | Self-balancing mixture of experts |
| Nagrecha | | Systems for parallel and distributed large-model deep learning training |
| CN119783812B (zh) | | Adaptation and optimization method for parallel training and inference of large models on new-generation heterogeneous supercomputers |
| Liu et al. | | Meta-mapreduce for scalable data mining |
| WO2025081828A1 (zh) | | Training model allocation method, apparatus, computer device and storage medium |
| CN115205092A (zh) | | Graph execution using an access request response dynamic batching component |
| US11194625B2 (en) | | Systems and methods for accelerating data operations by utilizing native memory management |
| Xie et al. | | Optimal distributed parallel algorithms for deep learning framework tensorflow |
| Kim et al. | | Comprehensive techniques of multi-GPU memory optimization for deep learning acceleration |
| CN118446265A (zh) | | Neural network accelerator design method and apparatus |
| Liu et al. | | G-learned index: Enabling efficient learned index on GPU |
| JP2025522299A (ja) | | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
| CN117786412A (zh) | | Elastic training method for large language models, cluster system, product and medium |
| CN116400926A (zh) | | Scalar engine processing method and apparatus for artificial intelligence chips |
| Liu et al. | | Parallelization Techniques for Large Language Models: A Review from Training to Inference |
| Xu et al. | | EdgeMesh: A hybrid distributed training mechanism for heterogeneous edge devices |
| CN120996216B (zh) | | Inference method, system, device and medium for dynamic-routing mixture-of-experts models |
| Chiu et al. | | Design and implementation of the CNN accelerator based on multi-streaming SIMD mechanisms |
| Dong et al. | | Slope: structural locality-aware programming model for composing array data analysis |
| US20260044427A1 (en) | | Automatic parallel execution of artificial intelligence workloads |
| Zhang et al. | | A distributed PCM clustering algorithm based on spark |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | RD03 | Notification of appointment of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7423 Effective date: 20250602 |
| | RD04 | Notification of resignation of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7424 Effective date: 20250605 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20260407 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 Effective date: 20260407 |