JP2025522299A5 - Google Patents

Info

Publication number
JP2025522299A5
Authority
JP
Japan
Prior art keywords
accelerators
sparse
experts
layers
computing system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2024569371A
Other languages
English (en)
Japanese (ja)
Other versions
JP2025522299A (ja)
Filing date
Publication date
Priority claimed from US17/848,679 (US12579470B2)
Application filed
Publication of JP2025522299A
Publication of JP2025522299A5
Legal status: Pending

JP2024569371A 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models Pending JP2025522299A (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/848,679 US12579470B2 (en) 2022-06-24 2022-06-24 Systems and methods for distributing layers of special mixture-of-experts machine learning models
US17/848,679 2022-06-24
PCT/US2023/022447 WO2023249754A1 (en) 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models

Publications (2)

Publication Number Publication Date
JP2025522299A JP2025522299A (ja) 2025-07-15
JP2025522299A5 JP2025522299A5 2026-04-16

Family

ID=87036771

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2024569371A Pending JP2025522299A (ja) 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models

Country Status (6)

Country Link
US (1) US12579470B2
EP (1) EP4544451A1
JP (1) JP2025522299A
KR (1) KR20250029051A
CN (1) CN119452368A
WO (1) WO2023249754A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117972293B (zh) * 2024-03-28 2024-06-07 北京思凌科半导体技术有限公司 Computing method, apparatus, device, and storage medium based on a mixture-of-experts model

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018085643A1 (en) * 2016-11-04 2018-05-11 Google Llc Mixture of experts neural networks
US10509846B2 (en) * 2017-12-13 2019-12-17 Intel Corporation Accelerator for processing data
US11893502B2 (en) 2017-12-20 2024-02-06 Advanced Micro Devices, Inc. Dynamic hardware selection for experts in mixture-of-experts model
EP3619654B1 (en) * 2018-07-23 2024-09-04 Google LLC Continuous parametrizations of neural network layer weights
US11003823B2 (en) 2018-08-09 2021-05-11 Palo Alto Research Center Incorporated Re-design of analog circuits
US20200117978A1 (en) 2018-10-12 2020-04-16 Alibaba Group Holding Limited Systems and methods for efficiently mapping neural networks to programmable logic devices
US20220230051A1 (en) 2018-11-18 2022-07-21 Innatera Nanosystems B.V. Spiking Neural Network
US12124941B2 (en) 2020-03-27 2024-10-22 Intel Corporation Methods and apparatus for dynamic batching of data for neural network workloads
US11586894B2 (en) 2020-05-04 2023-02-21 SiMa Technologies, Inc. Ordering computations of a machine learning network in a machine learning accelerator for efficient memory usage
US12530571B2 (en) 2020-07-08 2026-01-20 Nvidia Corporation Image generation using one or more neural networks
US20220036186A1 (en) * 2020-07-30 2022-02-03 Waymo Llc Accelerated deep reinforcement learning of agent control policies
US20220059200A1 (en) 2020-08-21 2022-02-24 Washington University Deep-learning systems and methods for medical report generation and anomaly detection
US12518135B2 (en) * 2021-02-05 2026-01-06 Google Llc Sparse and differentiable mixture of experts neural networks
US20230281510A1 (en) * 2022-03-07 2023-09-07 Qualcomm Incorporated Machine learning model architecture combining mixture of experts and model ensembling

Similar Documents

Publication Publication Date Title
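JP7078758B2 (ja) Modifying machine learning models to improve locality
CN114730275B (zh) Method and apparatus for vectorized resource scheduling in a distributed computing system using tensors
JP7732050B2 (ja) Hardware circuit for accelerating neural network computations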
US8400458B2 (en) Method and system for blocking data on a GPU
US8707320B2 (en) Dynamic partitioning of data by occasionally doubling data chunk size for data-parallel applications
US10120717B2 (en) Method for optimizing the size of a data subset of a processing space for improved execution performance
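CN116134416B (zh) Method for avoiding bank conflicts and pipeline conflicts in tensor memory layout
CN117112145B (zh) Training model allocation method, apparatus, computer device, and storage medium
CN112559165A (zh) Memory management method, apparatus, electronic device, and computer-readable storage medium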
US20240273346A1 (en) Self-balancing mixture of experts
CN120670107A (zh) Optimized thread-block scheduling method and system for heterogeneous computing based on dynamic topology mapping
US20140196043A1 (en) System and method for re-factorizing a square matrix into lower and upper triangular matrices on a parallel processor
Gonthier et al. Memory-aware scheduling of tasks sharing data on multiple GPUs with dynamic runtime systems
CN110795226A (zh) Method for processing tasks using a computer system, electronic device, and storage medium
Kim et al. LAS: Locality-aware scheduling for GEMM-accelerated convolutions in GPUs
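CN119806807A (zh) Method and apparatus for executing computing tasks on a multi-core system, and related products
CN102831102A (zh) Method and system for performing matrix multiplication on a computer cluster
JP2025522299A5
CN108108242B (zh) Big-data-based intelligent distribution control method for a storage layer
CN111860797B (zh) Arithmetic device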
US10402514B2 (en) Modeling and simulation of distributed computing frameworks
US7953816B2 (en) Virtual memory technique for efficiently solving connected problems in a distributed environment
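CN119597404B (zh) Virtualization method, apparatus, device, and storage medium for a graphics processing unit (GPU) in a container
CN120708921B (zh) Data structure optimization method, cardiac electrophysiology simulation method, apparatus, device, and medium
CN116166202B (zh) Replica placement method, apparatus, device, and medium in a big data environment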