JP2025522299A5 - Google Patents
Info
- Publication number
- JP2025522299A5
- Authority
- JP
- Japan
- Prior art keywords
- accelerators
- sparse
- experts
- layers
- computing system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/848,679 US12579470B2 (en) | 2022-06-24 | 2022-06-24 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
| US17/848,679 | 2022-06-24 | ||
| PCT/US2023/022447 WO2023249754A1 (en) | 2022-06-24 | 2023-05-16 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2025522299A (ja) | 2025-07-15 |
| JP2025522299A5 | 2026-04-16 |
Family
ID=87036771
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024569371A (Pending) JP2025522299A (ja) | 2022-06-24 | 2023-05-16 | Systems and methods for distributing layers of special mixture-of-experts machine learning models |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US12579470B2 |
| EP (1) | EP4544451A1 |
| JP (1) | JP2025522299A |
| KR (1) | KR20250029051A |
| CN (1) | CN119452368A |
| WO (1) | WO2023249754A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117972293B (zh) * | 2024-03-28 | 2024-06-07 | 北京思凌科半导体技术有限公司 | Computing method, apparatus, device, and storage medium based on a mixture-of-experts model |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018085643A1 (en) * | 2016-11-04 | 2018-05-11 | Google LLC | Mixture of experts neural networks |
| US10509846B2 (en) * | 2017-12-13 | 2019-12-17 | Intel Corporation | Accelerator for processing data |
| US11893502B2 (en) | 2017-12-20 | 2024-02-06 | Advanced Micro Devices, Inc. | Dynamic hardware selection for experts in mixture-of-experts model |
| EP3619654B1 (en) * | 2018-07-23 | 2024-09-04 | Google LLC | Continuous parametrizations of neural network layer weights |
| US11003823B2 (en) | 2018-08-09 | 2021-05-11 | Palo Alto Research Center Incorporated | Re-design of analog circuits |
| US20200117978A1 (en) | 2018-10-12 | 2020-04-16 | Alibaba Group Holding Limited | Systems and methods for efficiently mapping neural networks to programmable logic devices |
| US20220230051A1 (en) | 2018-11-18 | 2022-07-21 | Innatera Nanosystems B.V. | Spiking Neural Network |
| US12124941B2 (en) | 2020-03-27 | 2024-10-22 | Intel Corporation | Methods and apparatus for dynamic batching of data for neural network workloads |
| US11586894B2 (en) | 2020-05-04 | 2023-02-21 | SiMa Technologies, Inc. | Ordering computations of a machine learning network in a machine learning accelerator for efficient memory usage |
| US12530571B2 (en) | 2020-07-08 | 2026-01-20 | Nvidia Corporation | Image generation using one or more neural networks |
| US20220036186A1 (en) * | 2020-07-30 | 2022-02-03 | Waymo LLC | Accelerated deep reinforcement learning of agent control policies |
| US20220059200A1 (en) | 2020-08-21 | 2022-02-24 | Washington University | Deep-learning systems and methods for medical report generation and anomaly detection |
| US12518135B2 (en) * | 2021-02-05 | 2026-01-06 | Google LLC | Sparse and differentiable mixture of experts neural networks |
| US20230281510A1 (en) * | 2022-03-07 | 2023-09-07 | Qualcomm Incorporated | Machine learning model architecture combining mixture of experts and model ensembling |
2022
- 2022-06-24 US US17/848,679 patent/US12579470B2/en active Active
2023
- 2023-05-16 CN CN202380044796.8A patent/CN119452368A/zh active Pending
- 2023-05-16 WO PCT/US2023/022447 patent/WO2023249754A1/en not_active Ceased
- 2023-05-16 EP EP23734783.6A patent/EP4544451A1/en active Pending
- 2023-05-16 JP JP2024569371A patent/JP2025522299A/ja active Pending
- 2023-05-16 KR KR1020247042066A patent/KR20250029051A/ko active Pending
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
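| JP7078758B2 (ja) | Improving machine learning models to improve locality | |
| CN114730275B (zh) | Method and apparatus for vectorized resource scheduling in a distributed computing system using tensors | |
| JP7732050B2 (ja) | Hardware circuit for accelerating neural network computations | |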
| US8400458B2 (en) | Method and system for blocking data on a GPU | |
| US8707320B2 (en) | Dynamic partitioning of data by occasionally doubling data chunk size for data-parallel applications | |
| US10120717B2 (en) | Method for optimizing the size of a data subset of a processing space for improved execution performance | |
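| CN116134416B (zh) | Method for avoiding bank conflicts and pipeline conflicts in tensor memory layout | |
| CN117112145B (zh) | Training model allocation method, apparatus, computer device, and storage medium | |
| CN112559165A (zh) | Memory management method, apparatus, electronic device, and computer-readable storage medium | |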
| US20240273346A1 (en) | Self-balancing mixture of experts | |
| CN120670107A (zh) | Method and system for optimized scheduling of heterogeneous computing thread blocks based on dynamic topology mapping | |
| US20140196043A1 (en) | System and method for re-factorizing a square matrix into lower and upper triangular matrices on a parallel processor | |
| Gonthier et al. | Memory-aware scheduling of tasks sharing data on multiple GPUs with dynamic runtime systems | |
| CN110795226A (zh) | Method for processing tasks using a computer system, electronic device, and storage medium | |
| Kim et al. | LAS: Locality-aware scheduling for GEMM-accelerated convolutions in GPUs | |
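| CN119806807A (zh) | Method and apparatus for executing computing tasks on a multi-core system, and related products | |
| CN102831102A (zh) | Method and system for performing matrix multiplication on a computer cluster | |
| JP2025522299A5 | | |
| CN108108242B (zh) | Big-data-based intelligent distribution control method for a storage layer | |
| CN111860797B (zh) | Computing device | |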
| US10402514B2 (en) | Modeling and simulation of distributed computing frameworks | |
| US7953816B2 (en) | Virtual memory technique for efficiently solving connected problems in a distributed environment | |
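| CN119597404B (zh) | Method, apparatus, device, and storage medium for virtualizing a graphics processing unit (GPU) in a container | |
| CN120708921B (zh) | Data structure optimization method, cardiac electrophysiology simulation method, apparatus, device, and medium | |
| CN116166202B (zh) | Replica placement method, apparatus, device, and medium in a big data environment | |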