JP2025522299A - Systems and methods for distributing layers of special mixture-of-experts machine learning models - Google Patents

Systems and methods for distributing layers of special mixture-of-experts machine learning models

Info

Publication number
JP2025522299A
Authority
JP
Japan
Prior art keywords
accelerators
layers
sparse
experts
dense
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2024569371A
Other languages
English (en)
Japanese (ja)
Other versions
JP2025522299A5
Inventor
パテル,デヴァングクマール・ラメッシバーイ
ツォ,ウェイ
ユ,ユアン
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Publication of JP2025522299A
Publication of JP2025522299A5
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Neurology (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Complex Calculations (AREA)
JP2024569371A 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models Pending JP2025522299A (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/848,679 US12579470B2 (en) 2022-06-24 2022-06-24 Systems and methods for distributing layers of special mixture-of-experts machine learning models
US17/848,679 2022-06-24
PCT/US2023/022447 WO2023249754A1 (en) 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models

Publications (2)

Publication Number Publication Date
JP2025522299A (ja) 2025-07-15
JP2025522299A5 2026-04-16

Family

ID=87036771

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2024569371A Pending JP2025522299A (ja) 2022-06-24 2023-05-16 Systems and methods for distributing layers of special mixture-of-experts machine learning models

Country Status (6)

Country Link
US (1) US12579470B2
EP (1) EP4544451A1
JP (1) JP2025522299A
KR (1) KR20250029051A
CN (1) CN119452368A
WO (1) WO2023249754A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117972293B (zh) * 2024-03-28 2024-06-07 北京思凌科半导体技术有限公司 Computing method, apparatus, device, and storage medium based on a mixture-of-experts model

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018085643A1 (en) * 2016-11-04 2018-05-11 Google Llc Mixture of experts neural networks
US10509846B2 (en) * 2017-12-13 2019-12-17 Intel Corporation Accelerator for processing data
US11893502B2 (en) 2017-12-20 2024-02-06 Advanced Micro Devices, Inc. Dynamic hardware selection for experts in mixture-of-experts model
EP3619654B1 (en) * 2018-07-23 2024-09-04 Google LLC Continuous parametrizations of neural network layer weights
US11003823B2 (en) 2018-08-09 2021-05-11 Palo Alto Research Center Incorporated Re-design of analog circuits
US20200117978A1 (en) 2018-10-12 2020-04-16 Alibaba Group Holding Limited Systems and methods for efficiently mapping neural networks to programmable logic devices
US20220230051A1 (en) 2018-11-18 2022-07-21 Innatera Nanosystems B.V. Spiking Neural Network
US12124941B2 (en) 2020-03-27 2024-10-22 Intel Corporation Methods and apparatus for dynamic batching of data for neural network workloads
US11586894B2 (en) 2020-05-04 2023-02-21 SiMa Technologies, Inc. Ordering computations of a machine learning network in a machine learning accelerator for efficient memory usage
US12530571B2 (en) 2020-07-08 2026-01-20 Nvidia Corporation Image generation using one or more neural networks
US20220036186A1 (en) * 2020-07-30 2022-02-03 Waymo Llc Accelerated deep reinforcement learning of agent control policies
US20220059200A1 (en) 2020-08-21 2022-02-24 Washington University Deep-learning systems and methods for medical report generation and anomaly detection
US12518135B2 (en) * 2021-02-05 2026-01-06 Google Llc Sparse and differentiable mixture of experts neural networks
US20230281510A1 (en) * 2022-03-07 2023-09-07 Qualcomm Incorporated Machine learning model architecture combining mixture of experts and model ensembling

Also Published As

Publication number Publication date
CN119452368A (zh) 2025-02-14
US12579470B2 (en) 2026-03-17
KR20250029051A (ko) 2025-03-04
EP4544451A1 (en) 2025-04-30
US20230419166A1 (en) 2023-12-28
WO2023249754A1 (en) 2023-12-28

Similar Documents

Publication Publication Date Title
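CN110262901A (zh) Data processing method and data processing system
CN110689121A (zh) Method for splitting a neural network model using a multi-core processor, and related products
CN106055311B (zh) MapReduce task parallelization method based on pipelined multithreading
CN104778077B (zh) High-speed out-of-core graph processing method and system based on random and sequential disk access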
US20240273346A1 (en) Self-balancing mixture of experts
Nagrecha Systems for parallel and distributed large-model deep learning training
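CN119783812B (zh) Adaptation and optimization method for parallel training and inference of large models on new-generation heterogeneous supercomputers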
Liu et al. Meta-mapreduce for scalable data mining
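WO2025081828A1 (zh) Training model allocation method and apparatus, computer device, and storage medium
CN115205092A (zh) Graph execution using access request response dynamic batch assembly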
US11194625B2 (en) Systems and methods for accelerating data operations by utilizing native memory management
Xie et al. Optimal distributed parallel algorithms for deep learning framework tensorflow
Kim et al. Comprehensive techniques of multi-GPU memory optimization for deep learning acceleration
CN118446265A (zh) Neural network accelerator design method and apparatus
Liu et al. G-learned index: Enabling efficient learned index on GPU
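JP2025522299A (ja) Systems and methods for distributing layers of special mixture-of-experts machine learning models
CN117786412A (zh) Elastic training method for large language models, cluster system, product, and medium
CN116400926A (zh) Scalar engine processing method and apparatus for artificial intelligence chips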
Liu et al. Parallelization Techniques for Large Language Models: A Review from Training to Inference
Xu et al. EdgeMesh: A hybrid distributed training mechanism for heterogeneous edge devices
CN120996216B (zh) Inference method, system, device, and medium for dynamically routed mixture-of-experts models
Chiu et al. Design and implementation of the CNN accelator based on multi-streaming SIMD mechanisms
Dong et al. Slope: structural locality-aware programming model for composing array data analysis
US20260044427A1 (en) Automatic parallel execution of artificial intelligence workloads
Zhang et al. A distributed PCM clustering algorithm based on spark

Legal Events

Date Code Title Description
RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20250602

RD04 Notification of resignation of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7424

Effective date: 20250605

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20260407

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20260407