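CN119452368A - Systems and methods for distributing layers of special mixture-of-experts machine learning models - Google Patents
Systems and methods for distributing layers of special mixture-of-experts machine learning models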
- Publication number
- CN119452368A (application number CN202380044796.8A)
- Authority
- CN
- China
- Prior art keywords
- accelerators
- layers
- sparse
- experts
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Neurology (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/848,679 US12579470B2 (en) | 2022-06-24 | 2022-06-24 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
| US17/848,679 | 2022-06-24 | | |
| PCT/US2023/022447 WO2023249754A1 (en) | 2022-06-24 | 2023-05-16 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119452368A (zh) | 2025-02-14 |
Family
ID=87036771
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202380044796.8A Pending CN119452368A (zh) | Systems and methods for distributing layers of special mixture-of-experts machine learning models | 2022-06-24 | 2023-05-16 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US12579470B2 |
| EP (1) | EP4544451A1 |
| JP (1) | JP2025522299A |
| KR (1) | KR20250029051A |
| CN (1) | CN119452368A |
| WO (1) | WO2023249754A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117972293B (zh) * | 2024-03-28 | 2024-06-07 | 北京思凌科半导体技术有限公司 | Computing method, apparatus, device, and storage medium based on a mixture-of-experts model |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018085643A1 (en) * | 2016-11-04 | 2018-05-11 | Google Llc | Mixture of experts neural networks |
| US10509846B2 (en) * | 2017-12-13 | 2019-12-17 | Intel Corporation | Accelerator for processing data |
| US11893502B2 (en) | 2017-12-20 | 2024-02-06 | Advanced Micro Devices, Inc. | Dynamic hardware selection for experts in mixture-of-experts model |
| EP3619654B1 (en) * | 2018-07-23 | 2024-09-04 | Google LLC | Continuous parametrizations of neural network layer weights |
| US11003823B2 (en) | 2018-08-09 | 2021-05-11 | Palo Alto Research Center Incorporated | Re-design of analog circuits |
| US20200117978A1 (en) | 2018-10-12 | 2020-04-16 | Alibaba Group Holding Limited | Systems and methods for efficiently mapping neural networks to programmable logic devices |
| US20220230051A1 (en) | 2018-11-18 | 2022-07-21 | Innatera Nanosystems B.V. | Spiking Neural Network |
| US12124941B2 (en) | 2020-03-27 | 2024-10-22 | Intel Corporation | Methods and apparatus for dynamic batching of data for neural network workloads |
| US11586894B2 (en) | 2020-05-04 | 2023-02-21 | SiMa Technologies, Inc. | Ordering computations of a machine learning network in a machine learning accelerator for efficient memory usage |
| US12530571B2 (en) | 2020-07-08 | 2026-01-20 | Nvidia Corporation | Image generation using one or more neural networks |
| US20220036186A1 (en) * | 2020-07-30 | 2022-02-03 | Waymo Llc | Accelerated deep reinforcement learning of agent control policies |
| US20220059200A1 (en) | 2020-08-21 | 2022-02-24 | Washington University | Deep-learning systems and methods for medical report generation and anomaly detection |
| US12518135B2 (en) * | 2021-02-05 | 2026-01-06 | Google Llc | Sparse and differentiable mixture of experts neural networks |
| US20230281510A1 (en) * | 2022-03-07 | 2023-09-07 | Qualcomm Incorporated | Machine learning model architecture combining mixture of experts and model ensembling |
2022
- 2022-06-24 US: application US17/848,679, patent US12579470B2 (en), status: Active
2023
- 2023-05-16 CN: application CN202380044796.8A, publication CN119452368A (zh), status: Pending
- 2023-05-16 WO: application PCT/US2023/022447, publication WO2023249754A1 (en), status: Ceased
- 2023-05-16 EP: application EP23734783.6A, publication EP4544451A1 (en), status: Pending
- 2023-05-16 JP: application JP2024569371A, publication JP2025522299A (ja), status: Pending
- 2023-05-16 KR: application KR1020247042066A, publication KR20250029051A (ko), status: Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US12579470B2 (en) | 2026-03-17 |
| KR20250029051A (ko) | 2025-03-04 |
| EP4544451A1 (en) | 2025-04-30 |
| US20230419166A1 (en) | 2023-12-28 |
| WO2023249754A1 (en) | 2023-12-28 |
| JP2025522299A (ja) | 2025-07-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110262901B (zh) | | Data processing method and data processing system |
| US20230076850A1 (en) | | Computation of neural network node with large input values |
| CN110689121A (zh) | | Method for splitting a neural network model using a multi-core processor, and related products |
| CN111105023B (zh) | | Data flow reconstruction method and reconfigurable data flow processor |
| CN117112145B (zh) | | Training model allocation method, apparatus, computer device, and storage medium |
| US20240273346A1 (en) | | Self-balancing mixture of experts |
| CN119783812B (zh) | | Adaptation and optimization method for parallel training and inference of large models on new-generation heterogeneous supercomputers |
| CN119046015B (zh) | | Electronic device, method, and medium for neural network model training processing |
| CN119452368A (zh) | | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
| Jiang et al. | | Exploiting potential of deep neural networks by layer-wise fine-grained parallelism |
| Zhan et al. | | Field programmable gate array‐based all‐layer accelerator with quantization neural networks for sustainable cyber‐physical systems |
| CN117114055B (zh) | | FPGA binary neural network acceleration method for industrial application scenarios |
| JP2023024960A (ja) | | Optimization of memory usage for efficient neural network execution |
| Xu et al. | | EdgeMesh: A hybrid distributed training mechanism for heterogeneous edge devices |
| Liu et al. | | Parallelization Techniques for Large Language Models: A Review from Training to Inference |
| CN121188009B (zh) | | Method, apparatus, device, and storage medium for providing data to a distributed model |
| Chiu et al. | | Design and implementation of the CNN accelator based on multi-streaming SIMD mechanisms |
| CN121523863B (zh) | | Model scheduling method, apparatus, computer device, and storage medium |
| US12373261B2 (en) | | Just-in-time re-partitioning of feature maps for efficient balancing of compute core workloads |
| US20260044380A1 (en) | | Mixed parallelism for execution of artificial intelligence workloads |
| US20250124347A1 (en) | | Training model allocation method, apparatus, computer device, and storage medium |
| Karypis | | High-Performance Static and Streaming Tensor Factorization Algorithms |
| Aishwarya et al. | | Parallel Implementation of Dutch Flag Sorting Algorithm Using MPI and CUDA |
| CN120124673A (zh) | | Acceleration method and system for a graph neural network processor simulation framework |
| Asadikouhanjani | | Design of Efficient DNN Accelerator Architectures |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |