CN101930358B - Data processing method on single instruction multiple data (SIMD) structure and processor - Google Patents

Data processing method on single instruction multiple data (SIMD) structure and processor

Info

Publication number
CN101930358B
Authority
CN
China
Prior art keywords
instruction
predicate
data stream
value
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010261763
Other languages
Chinese (zh)
Other versions
CN101930358A (en
Inventor
安虹
许牧
徐光�
刘谷
李颀
任永青
李小强
孙涛
郝秀蕊
周伟
谭旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN 201010261763 priority Critical patent/CN101930358B/en
Publication of CN101930358A publication Critical patent/CN101930358A/en
Application granted granted Critical
Publication of CN101930358B publication Critical patent/CN101930358B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a data processing method on a single instruction multiple data (SIMD) architecture, and a processor. The method comprises: selecting an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits; decoding the instruction and obtaining the values of the flag bits and the index bits; determining from the value of the flag bits whether the instruction is a predicate instruction; when it is, reading the predicate from the entry of a preset predicate register file that corresponds to the value of the index bits; distributing the predicate to the multiple data streams; comparing the value of the flag bits with the value of the predicate corresponding to each data stream; determining that a data stream whose comparison result is the same is processable; and executing the instruction to process that data stream. By introducing predicated execution and using the result of comparing the predicate flag with the predicate to decide whether the instruction is executed for a data stream, the method avoids the energy waste and low processing efficiency caused by processing data that does not need to be processed.

Description

Data processing method on a single instruction multiple data (SIMD) architecture, and processor
Technical field
The present invention relates to the field of microprocessor technology, and in particular to a data processing method on a single instruction multiple data (SIMD) architecture and to a processor.
Background art
SIMD (Single Instruction Multiple Data) is a widely used technique for exploiting fine-grained data parallelism. Its core idea is to apply the same instruction sequence to different data on multiple execution units, so that several results are obtained at once and computational efficiency improves. SIMD was first used mainly in cabinet-scale high-performance computers, such as the ILLIAC IV developed at the University of Illinois in the United States. As semiconductor technology advanced and more and more transistors could be integrated on a chip, designers began to use SIMD inside processors to exploit fine-grained data parallelism.
Traditional SIMD microprocessors usually provide no support for conditional branch instructions. When a conditional branch is encountered, the programmer has to resort to manual redundant computation. As shown in Fig. 1, when a piece of branching code is executed, the SIMD processor executes both the left and the right branch path, which are drawn in the figure with solid and dashed lines respectively. After both paths have been computed, the branch condition is evaluated, and it determines which path's result is committed. Because both paths must be executed on the processor, and the commit stage keeps only the result that satisfies the branch condition while discarding the other, a large amount of useless computation is performed, which wastes power and lowers execution efficiency.
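The redundant computation described above can be illustrated with a minimal Python sketch. The four-lane input, the branch body (`x * 2` versus `x + 10`) and the function name are assumptions made purely for illustration; they do not come from Fig. 1.

```python
def simd_branch_without_predication(xs):
    """Hypothetical example: y = x * 2 if x > 0 else x + 10, one x per lane."""
    # Both branch paths are computed for every lane, whatever the condition is.
    taken = [x * 2 for x in xs]
    not_taken = [x + 10 for x in xs]
    # The branch condition is evaluated only after both paths are done.
    cond = [x > 0 for x in xs]
    # Commit stage: keep the result that matches the condition, discard the other.
    result = [t if c else n for t, n, c in zip(taken, not_taken, cond)]
    lane_ops = len(taken) + len(not_taken)  # work done, half of it discarded
    return result, lane_ops
```

With four lanes this performs eight lane operations to produce four results, which is the waste the invention sets out to remove.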
Summary of the invention
In view of this, the present invention provides a data processing method on a SIMD architecture and a processor, to solve the prior-art problems of wasted power and low execution efficiency during instruction execution on a SIMD architecture. The specific solution is as follows.
A data processing method on a single instruction multiple data (SIMD) architecture comprises:
selecting an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits;
decoding the instruction and obtaining the values of the flag bits and the index bits;
determining from the value of the flag bits whether the instruction is a predicate instruction;
when the instruction is a predicate instruction, looking up the corresponding entry in a preset predicate register file according to the value of the index bits, and reading the predicate stored in that entry;
distributing the predicate to the multiple data streams;
comparing the value of the flag bits with the value of the predicate corresponding to each data stream;
determining that a data stream whose comparison result is the same is a processable data stream;
executing the instruction to process the processable data stream.
Preferably, the method further comprises:
determining that a data stream whose comparison result is different is a non-processable data stream;
executing a no-operation instruction for the non-processable data stream.
Preferably, the method further comprises:
when the comparison result is different, stopping the processing of the corresponding data stream.
Preferably, when the instruction is a non-predicate instruction, the instruction is executed directly to process the multiple data streams.
A SIMD processor comprises:
an instruction selection unit, configured to select an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits;
a decoding unit, configured to decode the instruction and obtain the values of the flag bits and the index bits;
a judging unit, configured to determine from the value of the flag bits whether the instruction is a predicate instruction;
a predicate register file, configured to store predicates;
a reading unit, configured to, when the instruction is a predicate instruction, look up the corresponding entry in the preset predicate register file according to the value of the index bits and read the predicate stored in that entry;
an allocation unit, configured to distribute the predicate to the multiple data streams;
a comparing unit, configured to compare the value of the flag bits with the value of the predicate corresponding to each data stream;
a determining unit, configured to determine that a data stream whose comparison result is the same is a processable data stream;
an execution unit, configured to execute the instruction to process the processable data stream.
Preferably, the determining unit is further configured to determine that a data stream whose comparison result is different is a non-processable data stream, and the execution unit is further configured to execute a no-operation instruction for the non-processable data stream.
The data processing method on a SIMD architecture disclosed by the invention introduces predicated execution: the result of comparing the predicate flag with the predicate determines whether the instruction needs to be executed for a given data stream, and the instruction is executed only for the data streams whose comparison result is the same. This avoids the wasted power and low processing efficiency caused by processing data streams that do not need to be processed. Irregular control flow can be handled efficiently, which further broadens the range of applications of the SIMD architecture.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of executing a branch instruction on a prior-art SIMD architecture;
Fig. 2 is a flowchart of the data processing method disclosed in Embodiment 1;
Fig. 3 is a flowchart of the data processing method disclosed in Embodiment 2;
Fig. 4 is a schematic diagram of the predicate field structure disclosed in Embodiment 2;
Fig. 5 is a schematic diagram of the predicate flag bit definition disclosed in Embodiment 2;
Fig. 6 is a schematic diagram of the rule for comparing the flag bits with the predicate disclosed in Embodiment 2;
Fig. 7 is a schematic structural diagram of the SIMD processor disclosed by the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Predicated execution is a technique for resolving control dependences between instructions. By introducing a logical predicate, it converts the control dependence between a branch and the instructions in that branch into a data dependence between the branch condition (the predicate) and those instructions. Traditional predicated execution eliminates the control dependences among several basic blocks and, together with the compiler, merges them into a single hyperblock that is free of internal control dependences, thereby exposing more instruction-level parallelism. After a branch instruction is predicated, several Boolean predicates are produced, each representing a different branch path. Instructions on a branch path carry a predicate field, and an instruction executes only when its predicate field satisfies the corresponding predicate condition. In this way, predicated execution converts control dependences between instructions into data dependences on predicates.
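As a rough illustration of this conversion, the following Python sketch rewrites a two-way branch as a straight-line block of guarded instructions. The branch body and the representation of an instruction as a `(required predicate value, operation)` pair are assumptions chosen for the example, not something the patent specifies.

```python
def run_predicated(x):
    """If-converted form of: if x > 0: y = x * 2  else: y = x + 10."""
    p = x > 0  # the branch condition becomes a Boolean predicate
    # Each instruction is guarded by the predicate value it requires.
    block = [
        (True, lambda v: v * 2),    # formerly the 'then' path
        (False, lambda v: v + 10),  # formerly the 'else' path
    ]
    y = None
    for required, op in block:
        # No jump is taken: a data dependence on p replaces the control dependence.
        if p == required:
            y = op(x)
    return y
```

Both guarded instructions sit in the same block; which one takes effect depends only on the data value `p`.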
Based on predicated execution, the invention discloses a data processing method on a SIMD architecture, with the following embodiments.
Embodiment 1
The flow of the data processing method on a SIMD architecture disclosed in this embodiment is shown in Fig. 2 and comprises:
Step S21: select an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits.
The predicate field of an instruction is set in advance by the compiler. When the compiler translates the finished program into binary machine code, it automatically assigns a predicate number to each branch instruction and fills that number into the predicate field of the instructions associated with the branch. During program execution, compute instructions calculate the value of the corresponding predicate, and a predicate-register-file write instruction writes the predicate into the preset predicate register file. Each entry of the predicate register file has a written bit. When the write instruction completes the write to an entry, the written bit of that entry is set at the same time; depending on the actual hardware implementation it may be set to 1 or to 0, to indicate that the predicate register file now holds the value of the corresponding predicate. An instruction is selected only if its operands are ready and its predicate has been computed and written into the predicate register file.
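The written bit and the selection condition can be sketched as follows. The class name, the method names, and the choice to store one predicate value per data stream in each entry are assumptions for illustration; the sketch also assumes that a non-predicate instruction needs only its operands to be ready.

```python
class PredicateRegisterFile:
    """Hypothetical model: each entry holds per-stream predicates plus a written bit."""

    def __init__(self, num_entries):
        self.values = [None] * num_entries
        self.written = [False] * num_entries

    def write(self, index, stream_predicates):
        # The write instruction stores the predicate and sets the written bit together.
        self.values[index] = list(stream_predicates)
        self.written[index] = True

    def read(self, index):
        if not self.written[index]:
            raise ValueError("predicate %d has not been written yet" % index)
        return self.values[index]


def can_select(operands_ready, is_predicate_instruction, index, prf):
    """An instruction is selectable once its operands and its predicate are available."""
    if not operands_ready:
        return False
    # Assumption: non-predicate instructions do not wait on the predicate register file.
    return (not is_predicate_instruction) or prf.written[index]
```

For example, a predicate instruction that refers to entry 7 is held back until some earlier instruction has called `write(7, ...)`.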
Step S22: decode the instruction and obtain the values of the flag bits and the index bits.
Step S23: determine from the value of the flag bits whether the instruction is a predicate instruction.
Step S24: when the instruction is a predicate instruction, read the predicate from the entry of the preset predicate register file that corresponds to the value of the index bits.
Step S25: distribute the predicate to the multiple data streams.
Step S26: compare the value of the flag bits with the predicate corresponding to each data stream to determine whether they are the same.
Step S27: determine that a data stream whose comparison result is the same is a processable data stream.
Step S28: execute the instruction to process the processable data stream.
As these steps show, the data processing method on a SIMD architecture disclosed by the invention introduces predicated execution: the result of comparing the predicate flag with the predicate determines whether the instruction needs to be executed for a given data stream, and the instruction is executed only for the data streams whose comparison result is the same. This avoids the wasted power and low processing efficiency caused by processing data streams that do not need to be processed.
Embodiment 2
This embodiment describes the data processing method disclosed by the invention in more detail. Its flow is shown in Fig. 3.
Step S31: select an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits.
The instruction set requires that an N-bit predicate field be added to the encoding of every instruction. The structure of the predicate field is shown in Fig. 4. The first M bits of the predicate field are the predicate flag bits, which determine two things: 1. whether the instruction is a predicate instruction; 2. if it is, under which predicate state the instruction executes. This embodiment takes M=2 as an example, with the definition shown in Fig. 5. In the invention, predicates are stored in a preset predicate register file, which is a group of multi-port registers holding the corresponding predicate data. The remaining N-M bits of the predicate field are the predicate index bits, which can index 2^(N-M) predicates; the predicate corresponding to an instruction is found through the index bits in its predicate field. The predicate is then divided into groups according to the SIMD architecture. For example, if the SIMD architecture comprises four execution units and can therefore process four groups of data with one instruction, the predicate is divided into four groups, one per execution unit. The result of comparing each group's predicate with the flag bits controls whether that execution unit works, and thus whether the instruction is executed.
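The field layout can be made concrete with a short Python sketch. The patent fixes only M=2 in this embodiment; N=5 is an assumption here (chosen so that there are three index bits), as is the reading of "the first M bits" as the most significant bits of the field.

```python
def decode_predicate_field(field, n_bits=5, m_bits=2):
    """Split an N-bit predicate field into (flag bits, index bits)."""
    if not 0 <= field < (1 << n_bits):
        raise ValueError("field does not fit in %d bits" % n_bits)
    index_width = n_bits - m_bits
    flag = field >> index_width                 # upper M bits
    index = field & ((1 << index_width) - 1)    # lower N-M bits
    return flag, index


def addressable_predicates(n_bits=5, m_bits=2):
    """Number of predicate register file entries the index bits can reach: 2^(N-M)."""
    return 1 << (n_bits - m_bits)
```

Under these assumptions, the field `0b10111` decodes to flag bits `0b10` and index 7, and three index bits can reach eight entries.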
Step S32: decode the instruction and obtain the values of the flag bits and the index bits.
Step S33: determine from the value of the flag bits whether the instruction is a predicate instruction; if so, go to step S34; if not, go to step S311.
Whether the instruction is a predicate instruction is determined from the current flag bits according to the definition shown in Fig. 4.
Step S34: read the predicate from the entry of the preset predicate register file that corresponds to the value of the index bits.
The corresponding entry in the predicate register file is looked up according to the value of the index bits, and the predicate in that entry is read. For example, if the index bits are 111, their value is 7, and the predicate stored in entry 7 of the predicate register file is read.
Step S35: distribute the predicate to the multiple data streams.
Step S36: compare the value of the flag bits with the predicate corresponding to each data stream to determine whether they are the same; if so, go to step S37; if not, go to step S39.
The flag bits and the predicate are compared according to a preset rule. The rule can be chosen as needed but must follow one principle: when the instruction is a predicate instruction and the flag bits in its predicate field match the value of the predicate indexed in the predicate register file, the comparison result is TRUE; otherwise it is FALSE. As shown in Fig. 6, the result is TRUE only when the predicate value matches the predicate flag bits. A non-predicate instruction, that is, one whose predicate flag is 01, must execute in normal order, so its result is TRUE whatever the corresponding predicate is.
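One possible comparison rule is sketched below in Python. Only the encoding 01 for a non-predicate instruction comes from the text; the encodings `0b10` (execute when the predicate is 0) and `0b11` (execute when the predicate is 1) are hypothetical stand-ins for the definition in Fig. 5, which is not reproduced here.

```python
NON_PREDICATE = 0b01  # from the text: flag 01 marks a non-predicate instruction
EXEC_IF_FALSE = 0b10  # assumed encoding, not taken from Fig. 5
EXEC_IF_TRUE = 0b11   # assumed encoding, not taken from Fig. 5


def compare(flag, predicate):
    """Return True when the instruction should execute for this data stream."""
    if flag == NON_PREDICATE:
        return True  # always executes, whatever the predicate holds
    if flag == EXEC_IF_TRUE:
        return predicate == 1
    if flag == EXEC_IF_FALSE:
        return predicate == 0
    raise ValueError("flag encoding not defined in this sketch")


def stream_mask(flag, stream_predicates):
    """Apply the rule to the predicate of every data stream."""
    return [compare(flag, p) for p in stream_predicates]
```

The resulting list of TRUE/FALSE values is what decides, stream by stream, between steps S37 and S39.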
Step S37: determine that a data stream whose comparison result is the same is a processable data stream.
If the comparison result is the same, the execution condition is met and the instruction can be executed to process the corresponding data stream.
Step S38: execute the instruction to process the processable data stream.
Step S39: determine that a data stream whose comparison result is different is a non-processable data stream.
If the comparison result is different, the instruction does not need to be executed for this data stream.
Step S310: execute a no-operation instruction for the non-processable data stream.
If the comparison result is different, the instruction does not need to be executed for this data stream. The processing of the data stream can be replaced with a no-operation, or clock gating can be introduced to put the execution unit that would process this data stream to sleep, thereby stopping the processing of this data stream.
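The two options for a non-processable data stream can be modelled as follows. This is a behavioural sketch only: the function name, the `use_clock_gating` switch and the trace labels are invented for illustration, and real clock gating is a hardware mechanism that software can only mimic.

```python
def execute_masked(op, stream_values, mask, use_clock_gating=False):
    """Run op on processable streams; leave the other streams untouched."""
    results, trace = [], []
    for value, enabled in zip(stream_values, mask):
        if enabled:
            results.append(op(value))
            trace.append("exec")
        else:
            # Either a no-operation fills the slot or the unit is put to sleep;
            # in both cases the data of this stream is not modified.
            results.append(value)
            trace.append("gated" if use_clock_gating else "nop")
    return results, trace
```

Either way the value of a masked-off stream is unchanged; the difference lies in whether the execution unit is clocked at all.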
Step S311: execute the instruction to process all data streams.
This embodiment specifies how the flag bits are used to determine the instruction type and how a rule is used to decide whether the instruction executes. Where the comparison result is different, the instruction is replaced with a no-operation instruction, so the data is not processed and power is saved. This reduces redundant computation for branch instructions, improves processing efficiency and lowers energy consumption. Applications with irregular control flow can be handled efficiently, which further broadens the applicability of the SIMD architecture.
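Putting steps S33 to S311 together, a minimal end-to-end sketch in Python might look like this. As before, only the flag value 01 comes from the text; the other two flag encodings, the dictionary used as a predicate register file, and the example branch (`x * 2` versus `x + 10`) are assumptions for illustration.

```python
NON_PREDICATE, EXEC_IF_FALSE, EXEC_IF_TRUE = 0b01, 0b10, 0b11  # last two assumed


def step(flag, index, op, prf, streams):
    """Run one decoded instruction; return (new stream values, lane operations)."""
    if flag == NON_PREDICATE:                   # S33 -> S311
        return [op(v) for v in streams], len(streams)
    if flag not in (EXEC_IF_FALSE, EXEC_IF_TRUE):
        raise ValueError("flag encoding not defined in this sketch")
    predicates = prf[index]                     # S34: read the indexed entry
    wanted = 1 if flag == EXEC_IF_TRUE else 0
    out, lane_ops = [], 0
    for value, p in zip(streams, predicates):   # S35, S36: one predicate per stream
        if p == wanted:                         # S37, S38: processable stream
            out.append(op(value))
            lane_ops += 1
        else:                                   # S39, S310: no-operation
            out.append(value)
    return out, lane_ops


def run_example():
    xs = [1, -2, 3, -4]
    prf = {7: [1 if x > 0 else 0 for x in xs]}  # predicate x > 0 stored in entry 7
    ys, a = step(EXEC_IF_TRUE, 7, lambda v: v * 2, prf, xs)
    ys, b = step(EXEC_IF_FALSE, 7, lambda v: v + 10, prf, ys)
    return ys, a + b
```

In this example the two predicated instructions perform four lane operations in total, one per data stream, because each stream executes only the path its predicate selects.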
The invention also discloses a SIMD processor, whose structure is shown in Fig. 7. It comprises an instruction selection unit 71, a decoding unit 72, a judging unit 73, a predicate register file 74, a reading unit 75, an allocation unit 76, a comparing unit 77, a determining unit 78 and an execution unit 79, wherein:
The instruction selection unit 71 is configured to select the instruction to be executed; the decoding unit 72 is configured to decode the instruction and obtain the values of the flag bits and the index bits; the judging unit 73 is configured to determine from the value of the flag bits whether the instruction is a predicate instruction; the predicate register file 74 is configured to store predicates; the reading unit 75 is configured to, when the instruction is a predicate instruction, read the predicate from the entry of the preset predicate register file that corresponds to the value of the index bits; the allocation unit 76 is configured to distribute the predicate to the multiple data streams; the comparing unit 77 is configured to compare the value of the flag bits with the predicate corresponding to each data stream to determine whether they are the same; the determining unit 78 is configured to determine that a data stream whose comparison result is the same is a processable data stream; and the execution unit 79 is configured to execute the instruction to process the processable data stream.
The SIMD processor disclosed in this embodiment has four execution units and can process four data streams simultaneously. Each execution unit decides whether to execute the instruction according to the comparison result of the comparing unit. When a functional unit does not need to execute the instruction, a no-operation instruction can be inserted directly into the pipeline in place of the current instruction so that the data is not processed, or clock gating can be introduced to put the functional unit to sleep and stop the processing of the instruction.
The determining unit is further configured to determine that a data stream whose comparison result is different is a non-processable data stream, and the execution unit is further configured to execute a no-operation instruction for the non-processable data stream.
As can be seen from the figure, the SIMD processor disclosed in this embodiment adds only a predicate register file and control logic in hardware and does not require changes to the microarchitecture of the original processor; it is structurally simple and easy to implement.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the parts they share can be understood by cross-reference. Because the disclosed device corresponds to the disclosed method, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, in computer software, or in a combination of the two. To make the interchangeability of hardware and software clear, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are carried out in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A data processing method on a single instruction multiple data (SIMD) architecture, characterized by comprising:
selecting an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits;
decoding the instruction and obtaining the values of the flag bits and the index bits;
determining from the value of the flag bits whether the instruction is a predicate instruction;
when the instruction is a predicate instruction, looking up the corresponding entry in a preset predicate register file according to the value of the index bits, and reading the predicate stored in that entry;
distributing the predicate to the multiple data streams;
comparing the value of the flag bits with the value of the predicate corresponding to each data stream;
determining that a data stream whose comparison result is the same is a processable data stream;
executing the instruction to process the processable data stream.
2. The method according to claim 1, characterized by further comprising:
determining that a data stream whose comparison result is different is a non-processable data stream;
executing a no-operation instruction for the non-processable data stream.
3. The method according to claim 1, characterized by further comprising:
when the comparison result is different, stopping the processing of the corresponding data stream.
4. The method according to any one of claims 1 to 3, characterized in that, when the instruction is a non-predicate instruction, the instruction is executed directly to process the multiple data streams.
5. A SIMD processor, characterized by comprising:
an instruction selection unit, configured to select an instruction that meets a condition to process multiple data streams, the instruction having a predicate field that comprises flag bits and index bits;
a decoding unit, configured to decode the instruction and obtain the values of the flag bits and the index bits;
a judging unit, configured to determine from the value of the flag bits whether the instruction is a predicate instruction;
a predicate register file, configured to store predicates;
a reading unit, configured to, when the instruction is a predicate instruction, look up the corresponding entry in the preset predicate register file according to the value of the index bits and read the predicate stored in that entry;
an allocation unit, configured to distribute the predicate to the multiple data streams;
a comparing unit, configured to compare the value of the flag bits with the value of the predicate corresponding to each data stream;
a determining unit, configured to determine that a data stream whose comparison result is the same is a processable data stream;
an execution unit, configured to execute the instruction to process the processable data stream.
6. The SIMD processor according to claim 5, characterized in that the determining unit is further configured to determine that a data stream whose comparison result is different is a non-processable data stream, and the execution unit is further configured to execute a no-operation instruction for the non-processable data stream.
CN 201010261763 2010-08-16 2010-08-16 Data processing method on single instruction multiple data (SIMD) structure and processor Active CN101930358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010261763 CN101930358B (en) 2010-08-16 2010-08-16 Data processing method on single instruction multiple data (SIMD) structure and processor

Publications (2)

Publication Number Publication Date
CN101930358A CN101930358A (en) 2010-12-29
CN101930358B true CN101930358B (en) 2013-06-19

Family

ID=43369556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010261763 Active CN101930358B (en) 2010-08-16 2010-08-16 Data processing method on single instruction multiple data (SIMD) structure and processor

Country Status (1)

Country Link
CN (1) CN101930358B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107545066B (en) * 2011-12-08 2021-01-15 甲骨文国际公司 Techniques for maintaining column vectors of relational data within volatile memory
US9830164B2 (en) * 2013-01-29 2017-11-28 Advanced Micro Devices, Inc. Hardware and software solutions to divergent branches in a parallel pipeline
US11113054B2 (en) 2013-09-10 2021-09-07 Oracle International Corporation Efficient hardware instructions for single instruction multiple data processors: fast fixed-length value compression
JP6329412B2 (en) * 2014-03-26 2018-05-23 株式会社メガチップス SIMD processor
CN104317555B (en) * 2014-10-15 2017-03-15 中国航天科技集团公司第九研究院第七七一研究所 The processing meanss and method for merging and writing revocation are write in SIMD processor
CN107491288B (en) * 2016-06-12 2020-05-08 合肥君正科技有限公司 Data processing method and device based on single instruction multiple data stream structure
US10783102B2 (en) 2016-10-11 2020-09-22 Oracle International Corporation Dynamically configurable high performance database-aware hash engine
US10725947B2 (en) 2016-11-29 2020-07-28 Oracle International Corporation Bit vector gather row count calculation and handling in direct memory access engine
CN109547418B (en) * 2018-10-31 2021-05-14 中国科学院计算机网络信息中心 Data transmission network system based on Software Defined Network (SDN)
CN111124491B (en) * 2019-12-12 2022-04-22 浪潮(北京)电子信息产业有限公司 Batch processing method, device, equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080016320A1 (en) * 2006-06-27 2008-01-17 Amitabh Menon Vector Predicates for Sub-Word Parallel Operations
US20080114975A1 (en) * 2006-11-10 2008-05-15 Hsueh-Bing Yen Method and processing system for nested flow control utilizing predicate register and branch register
CN101256547A (en) * 2007-03-01 2008-09-03 矽统科技股份有限公司 Method for controlling nest-shaped process flow and processing system
CN101373427A (en) * 2007-08-24 2009-02-25 松下电器产业株式会社 Program execution control device
US20090055635A1 (en) * 2007-08-24 2009-02-26 Matsushita Electric Industrial Co., Ltd. Program execution control device

Also Published As

Publication number Publication date
CN101930358A (en) 2010-12-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant