CN114626540A - Processor and related products - Google Patents
Processor and related products
- Publication number
- CN114626540A (application CN202011448594.4A)
- Authority
- CN
- China
- Prior art keywords
- execution unit
- threaded
- data
- vector
- vector execution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Concepts
Concept | Category | Sections | Count |
---|---|---|---|
vector | Substances | claims, abstract, description | 367 |
processing | Methods | claims, abstract, description | 51 |
calculation method | Methods | claims, description | 122 |
method | Methods | claims, description | 41 |
computer program | Methods | claims, description | 7 |
processing method | Methods | claims, description | 5 |
interaction | Effects | abstract, description | 4 |
matrix | Substances | description | 31 |
diagram | Methods | description | 19 |
transfer | Methods | description | 16 |
engineering process | Methods | description | 12 |
floating | Methods | description | 10 |
benefit | Effects | description | 8 |
design | Methods | description | 7 |
function | Effects | description | 7 |
artificial intelligence | Methods | description | 6 |
process | Effects | description | 6 |
communication | Methods | description | 3 |
effects | Effects | description | 2 |
integration | Effects | description | 2 |
rearrangement | Methods | description | 2 |
Homo | Species | description | 1 |
development | Methods | description | 1 |
gene expressions | Effects | description | 1 |
masking | Effects | description | 1 |
modification | Methods | description | 1 |
modification | Effects | description | 1 |
persistent effect | Effects | description | 1 |
substitution reaction | Methods | description | 1 |
synchronised effect | Effects | description | 1 |
transient effect | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
- G06F9/30038—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Advance Control (AREA)
Abstract
Description
Claims (18)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011448594.4A CN114626540A (zh) | 2020-12-11 | 2020-12-11 | Processor and related products |
PCT/CN2021/101025 WO2022121275A1 (zh) | 2020-12-11 | 2021-06-18 | Processor, multithreaded processing method, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011448594.4A CN114626540A (zh) | 2020-12-11 | 2020-12-11 | Processor and related products |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114626540A (zh) | 2022-06-14 |
Family
ID=81895669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011448594.4A (Pending) CN114626540A (zh) | Processor and related products | 2020-12-11 | 2020-12-11 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114626540A (zh) |
WO (1) | WO2022121275A1 (zh) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8698817B2 (en) * | 2004-11-15 | 2014-04-15 | Nvidia Corporation | Video processor having scalar and vector components |
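CN105373367B (zh) * | 2015-10-29 | 2018-03-02 | National University of Defense Technology of the Chinese People's Liberation Army | Vector SIMD operation structure supporting scalar-vector cooperative work |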
US20170132003A1 (en) * | 2015-11-10 | 2017-05-11 | Futurewei Technologies, Inc. | System and Method for Hardware Multithreading to Improve VLIW DSP Performance and Efficiency |
CN110503179B (zh) * | 2018-05-18 | 2024-03-01 | Shanghai Cambricon Information Technology Co., Ltd. | Computing method and related products |
2020
- 2020-12-11: CN application CN202011448594.4A (published as CN114626540A, zh), status: active, Pending
2021
- 2021-06-18: WO application PCT/CN2021/101025 (published as WO2022121275A1, zh), status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2022121275A1 (zh) | 2022-06-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9830156B2 (en) | | Temporal SIMT execution optimization through elimination of redundant operations |
TWI628594B (zh) | | User-level fork and join processors, methods, systems, and instructions |
Garland et al. | | Understanding throughput-oriented architectures |
US7925860B1 (en) | | Maximized memory throughput using cooperative thread arrays |
US8412917B2 (en) | | Data exchange and communication between execution units in a parallel processor |
EP2480979B1 (en) | | Unanimous branch instructions in a parallel thread processor |
CN111310910A (zh) | | Computing device and method |
US20210368656A1 (en) | | Intelligent control and distribution of a liquid in a data center |
US20070130447A1 (en) | | System and method for processing thread groups in a SIMD architecture |
US11895808B2 (en) | | Intelligent refrigeration-assisted data center liquid cooling |
US8572355B2 (en) | | Support for non-local returns in parallel thread SIMD engine |
Khairy et al. | | A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity |
US9569211B2 (en) | | Predication in a vector processor |
US8413151B1 (en) | | Selective thread spawning within a multi-threaded processing system |
CN114626540A (zh) | | Processor and related products |
Soliman | | Mat-core: A matrix core extension for general-purpose processors |
US11822541B2 (en) | | Techniques for storing sub-alignment data when accelerating Smith-Waterman sequence alignments |
US11550584B1 (en) | | Implementing specialized instructions for accelerating Smith-Waterman sequence alignments |
US20230101085A1 (en) | | Techniques for accelerating smith-waterman sequence alignments |
US20230305844A1 (en) | | Implementing specialized instructions for accelerating dynamic programming algorithms |
US11416261B2 (en) | | Group load register of a graph streaming processor |
Soliman et al. | | Exploiting ILP, DLP, TLP, and MPI to accelerate matrix multiplication on Xeon processors |
Soliman | | Mat-core: a decoupled matrix core extension for general-purpose processors |
Raju et al. | | Performance enhancement of CUDA applications by overlapping data transfer and Kernel execution |
CN117437113A (zh) | | System, method, and storage medium for accelerated processing of image data |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB03 | Change of inventor or designer information | Inventors after: Wang Wenqiang, Sun Haitao, Zhang Qirong, Zhu Zhiqi, Xu Ningyi. Inventors before: Sun Haitao, Wang Wenqiang, Zhang Qirong, Zhu Zhiqi, Xu Ningyi |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40067406; Country of ref document: HK |