TWI721999B - Vector length querying instruction - Google Patents

Vector length querying instruction

Info

Publication number
TWI721999B
Authority
TW
Taiwan
Prior art keywords
vector
value
proportional
length
vector length
Prior art date
Application number
TW105122825A
Other languages
Chinese (zh)
Other versions
TW201717051A (en)
Inventor
奈吉爾約翰 史蒂芬斯
葛利格瑞斯 馬科里斯
亞歷賈卓 馬丁維森特
納森奈爾 普瑞米里耶
Original Assignee
英商Arm股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 英商Arm股份有限公司
Publication of TW201717051A
Application granted
Publication of TWI721999B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/30149 Instruction analysis, e.g. decoding, instruction word fields of variable length instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Complex Calculations (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

A data processing system 2 supporting vector processing operations uses scaling vector length querying instructions. The scaling vector length querying instructions return a result which is dependent upon a number of elements in a vector for a variable vector element size specified by the instruction and multiplied by a scaling value specified by the instruction. The scaling vector length querying instructions may be in the form of count instructions, increment instructions or decrement instructions. The instructions may include a pattern constraint, such as modulo(M) or power of 2, which is applied to the partial result value representing the number of vector elements provided for the vector element size specified by the instruction.

Description

Vector length querying instruction

This disclosure relates to the field of data processing systems. More particularly, this disclosure relates to data processing systems supporting vector processing.

It is known to provide data processing systems which support the processing of vector operands comprising a plurality of vector elements. The number of bits within a vector register is typically defined by the processor architecture concerned. The bits of a vector register may be divided between vector elements of differing sizes, such that a given vector register provides a differing number of vector elements.

According to at least some embodiments of the disclosure there is provided apparatus for processing data comprising: processing circuitry to perform vector processing operations; and decoder circuitry to decode program instructions to generate control signals to control the processing circuitry to perform the vector processing operations; wherein the decoder circuitry is responsive to a scaling vector length querying instruction to control the processing circuitry to return a result value dependent upon a vector length multiplied by a scaling value, the vector length being the vector length used by the apparatus when performing the vector processing operations and the scaling value being specified by the scaling vector length querying instruction.

According to at least some embodiments of the disclosure there is provided apparatus for processing data comprising: processing means for performing vector processing operations; and decoder means for decoding program instructions to generate control signals to control the processing circuitry to perform the vector processing operations; wherein the decoder means is responsive to a scaling vector length querying instruction to control the processing means to return a result value dependent upon a vector length multiplied by a scaling value, the vector length being the vector length used by the apparatus when performing the vector processing operations and the scaling value being specified by the scaling vector length querying instruction.

According to at least some embodiments of the disclosure there is provided a method of processing data comprising: decoding a scaling vector length querying instruction to control processing circuitry to return a result value dependent upon a vector length multiplied by a scaling value, the vector length being the vector length used when performing vector processing operations and the scaling value being specified by the scaling vector length querying instruction.

The above and other objects, features and advantages of this disclosure will be apparent from the following detailed description of illustrative embodiments, which is to be read in connection with the accompanying drawings.

Figure 1 schematically illustrates a data processing system 2 comprising a processor 4 coupled to a memory 6 which stores data values 8 and program instructions 10. The processor 4 includes an instruction fetch unit 12 for fetching program instructions 10 from the memory 6 and supplying the fetched program instructions to decoder circuitry 14. The decoder circuitry 14 decodes the fetched program instructions and generates control signals 16 to control vector processing circuitry 18 to perform vector processing operations, as specified by the decoded vector instructions, upon vector registers held within vector register circuitry 20. It will be appreciated that in practice the processor 4 will typically contain many more circuit elements, which have been omitted from the drawings.

Figure 1 also schematically illustrates an example vector register Zi which, in this example, has a vector bit size of 256. The vector register Zi is formed of sixteen vector elements a0 to a15. Each of these vector elements has a vector element bit size of 16 bits, and accordingly the number of vector elements within the vector register Zi is 16. The vector element size is a variable which may be specified by the vector instruction being decoded. For example, a vector instruction may be encoded to specify that the vector elements are bytes, halfwords, words or doublewords (8, 16, 32 and 64 bits respectively). The number of vector elements within the vector register Zi changes with the vector element bit size. Thus, a vector register Zi formed of 256 bits can support sixteen 16-bit vector elements, eight 32-bit vector elements or four 64-bit vector elements.
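The relationship between register width and element count described above can be sketched in a few lines of Python. This is only an illustration: the function name `elements_per_vector` and the `ELEMENT_BITS` table are assumptions made for the example, not names defined by the disclosure.

```python
# Illustrative sketch only: names are assumptions, not part of the disclosure.
ELEMENT_BITS = {"B": 8, "H": 16, "W": 32, "D": 64}  # byte, halfword, word, doubleword


def elements_per_vector(vector_bits: int, element_bits: int) -> int:
    """Number of elements of the given size that fit in one vector register."""
    if element_bits not in ELEMENT_BITS.values():
        raise ValueError(f"unsupported element size: {element_bits}")
    return vector_bits // element_bits


# A 256-bit register holds 16 halfwords, 8 words or 4 doublewords.
counts_256 = {name: elements_per_vector(256, bits) for name, bits in ELEMENT_BITS.items()}
```

The same function applied to a different register width gives the element counts for another implementation of the same instruction set.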

A given implementation of the processor 4 will include vector register circuitry 20 supporting vector registers Zi of a given bit size. However, different implementations of the processor 4 using the same instruction set architecture may support vector registers of different sizes, e.g. 512 bits, 384 bits, 1024 bits, and so on.

As a way of enabling program code including vector instructions to adapt, with little or no modification, to vector register circuitry 20 providing vector registers Zi of differing sizes, one or more scaling vector length querying instructions are provided. These instructions return a result value dependent upon the number of elements in a vector for a variable vector element size (e.g. byte, halfword, word or doubleword) specified by the instruction, multiplied by a scaling value also specified by the instruction. Such an instruction returns a result which takes account both of the particular vector register size of the implementation and of the degree of program loop unrolling provided by the vector code being executed. The scaling value may be provided as a constant integer value encoded within the scaling vector length querying instruction (e.g. an immediate value within the instruction).
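To make the portability argument concrete, the following Python sketch models a loop whose stride comes from a count-type query rather than from a hard-coded constant. It is a software analogy under stated assumptions: `scaled_count`, `add_arrays` and the `vector_bits` parameter standing in for the hardware register width are all hypothetical names introduced for illustration.

```python
# Illustrative sketch only: function names and the software model of the
# hardware register width are assumptions, not part of the disclosure.
def scaled_count(vector_bits: int, element_bits: int, scale: int) -> int:
    """Model of a count-type query: elements per vector times the scaling value."""
    return (vector_bits // element_bits) * scale


def add_arrays(a, b, vector_bits, element_bits=32, unroll=2):
    """Add two equal-length lists in strides sized by the scaled query result.

    Returns the list of sums and the number of loop iterations taken.
    """
    stride = scaled_count(vector_bits, element_bits, unroll)
    out, iterations = [], 0
    for base in range(0, len(a), stride):
        end = min(base + stride, len(a))  # the final stride may be partial
        out.extend(x + y for x, y in zip(a[base:end], b[base:end]))
        iterations += 1
    return out, iterations


a = list(range(100))
b = list(range(100, 200))
narrow = add_arrays(a, b, vector_bits=128)  # stride = 4 * 2 = 8
wide = add_arrays(a, b, vector_bits=512)    # stride = 16 * 2 = 32
```

The loop body is identical for both register widths; only the stride returned by the query, and hence the iteration count, differs.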

Figure 2 schematically illustrates a variety of different forms of scaling vector length querying instruction. In particular, the instruction may be a count instruction, an increment instruction or a decrement instruction. Further types of scaling vector length querying instruction are also possible. The instruction additionally specifies the vector element size relative to which the result value is determined; the vector element size may accordingly be byte B, halfword H, word W or doubleword D. The instruction also specifies a scalar register Xd, which may serve both as the source of an input operand value for the instruction and as the destination to which the result value is written.

The final two parameters of the scaling vector length querying instructions illustrated in Figure 2 are a pattern constraint and a scaling value. The pattern constraint field specifies a constraint to be applied to the vector length used by the processing apparatus, so that the result value of the instruction is returned subject to that constraint. Vector pattern constraints can take a variety of different forms. The constraint may be, for example, that the number of vector elements is the maximum value provided by the apparatus that is also a multiple of a specified value M (modulo(M)). Another example is that the maximum value returned for the number of vector elements supported is constrained to be a power of 2, e.g. 2, 4, 8, 16, 32, and so on. A further example is that no constraint is applied to the maximum number of elements supported beyond the physical size of the vector register Zi. Thus, if the vector register length is 256 bits and the element size is 16 bits, then with this "All" constraint the result value is based on a vector element count of 16.
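The three constraint forms described above can be expressed as a single helper that rounds an element count down to the largest permitted value. This is a minimal sketch: the pattern names `"all"`, `"mul"` and `"pow2"` and the function `constrain` are assumptions chosen for illustration, and the disclosure does not prescribe this particular formulation.

```python
# Illustrative sketch only: pattern names and the helper are assumptions.
def constrain(elements: int, pattern: str = "all", m: int = 1) -> int:
    """Largest element count not exceeding `elements` that satisfies the pattern."""
    if elements < 1:
        raise ValueError("element count must be positive")
    if pattern == "all":   # no constraint beyond the physical register size
        return elements
    if pattern == "mul":   # largest multiple of m
        if m < 1:
            raise ValueError("m must be positive")
        return (elements // m) * m
    if pattern == "pow2":  # largest power of two
        return 1 << (elements.bit_length() - 1)
    raise ValueError(f"unknown pattern: {pattern}")
```

For a 256-bit register with 16-bit elements, `constrain(16, "all")` leaves the count at 16, matching the "All" example in the text.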

The final parameter specified by the scaling vector length querying instruction is the scaling value. This may be a constant integer value encoded within the instruction itself as an immediate value. Whilst the scaling value could take a variety of values, it has been found that scaling values in the range 1 to 8 inclusive support most of the degrees of loop unrolling typically found, while conserving the instruction bit space used to encode the scaling vector length querying instructions.

Figure 3 schematically illustrates result values which may be returned for some example scaling vector length querying instructions. The columns illustrated are the vector bit size (i.e. the bit size of the vector register Zi), the vector element bit size (e.g. byte, halfword, word, doubleword), the vector pattern constraint (e.g. All, modulo(M), power of 2, and so on), the scaling value (e.g. a constant value encoded within the instruction itself, in the range 1 to 8 inclusive) and the instruction type (e.g. count, increment, decrement). For a count instruction the input value may be considered to be "0"; for increment and decrement instructions it is the value provided by the input register Xd.

Consider the first row illustrated in Figure 3, which specifies a vector bit size of 128. The vector element bit size is 8, so the unconstrained maximum vector element count is 16. The pattern constraint for this row is "All", which corresponds to no constraint. The scaling value for the first row is "1", so the result is not scaled. The instruction type is count, so the instruction simply returns the number of byte-sized vector elements within the vector register, which is 16.

A more complex example is given in the fifth row. Here the vector bit size is 256 and the vector element bit size is 32, indicating that the unconstrained number of vector elements supported is 8. However, the pattern constraint is that the number of vector elements supported should be a multiple of 3. The pattern constraint therefore reduces the number of vector elements treated as the maximum supported to 6. The scaling value for the fifth row is 2 and the instruction type is count, so the result value is twice the pattern-constrained value, namely 12.

A further example is given in the tenth row. Here the vector bit size is 384 and the vector element bit size is 16, which would indicate that the raw number of vector elements supported by the vector register is 24. However, the pattern constraint applied in this row is that the vector element count supported should be a power of 2, so the maximum number of 16-bit vector elements supported is treated as 16. A scaling factor of 2 is applied, giving a scaled partial result value of 32. The instruction is an increment-type scaling vector length querying instruction, and the input value to be incremented, held within the scalar register Xd, is 48. This yields a result value (incremented value) of 80.
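The arithmetic of the three rows discussed above can be written out step by step. The layout below is only a restatement of the figures already given in the text (raw element count, constrained count, scaled value, final result); no additional rows of Figure 3 are assumed.

```latex
\[
\begin{aligned}
\text{Row 1:}\quad  & 128/8 = 16,  && \text{All} \Rightarrow 16,               && 16 \times 1 = 16, && \text{count} \Rightarrow 16 \\
\text{Row 5:}\quad  & 256/32 = 8,  && \text{multiple of } 3 \Rightarrow 6,     && 6 \times 2 = 12,  && \text{count} \Rightarrow 12 \\
\text{Row 10:}\quad & 384/16 = 24, && \text{power of } 2 \Rightarrow 16,       && 16 \times 2 = 32, && \text{increment} \Rightarrow 48 + 32 = 80
\end{aligned}
\]
```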

When the scaling vector length querying instruction is a scaling count instruction, it returns a count value dependent upon the number of elements supported multiplied by the scaling value. When it is a scaling increment instruction, it returns an increment result value dependent upon an input value to be incremented, the increment being determined from the vector bit size, the vector element bit size, the pattern constraint and the scaling value. In a similar way, when it is a scaling decrement instruction, it returns a decrement result value dependent upon an input value to be decremented.

Figure 4 illustrates an example implementation of circuitry for determining the result value in response to a scaling vector length querying instruction. This implementation takes the form of a lookup table comprising a table address decoder 22 which indexes into a table of result values 24. Parameters (fields) from the scaling vector length querying instruction are supplied as inputs to the table address decoder 22. These fields comprise the element size (2 bits), the constraint pattern applied (5 bits) and the scaling value applied (3 bits). The table address decoder 22 has a fixed form for a particular implementation, i.e. for the particular vector length implemented. In response to the input signals supplied to it, the table address decoder 22 generates a one-hot output which selects a result value from the table of result values 24. If the instruction is a scaling increment instruction or a scaling decrement instruction, this result value is then supplied to increment or decrement circuitry (an adder or a subtractor) respectively.
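A software analogue of this lookup-table arrangement is sketched below. The table is precomputed once for a fixed vector length and indexed by the instruction fields; an add or subtract follows for the increment and decrement forms. The dictionary keys, the three named patterns and the `execute` helper are assumptions for illustration, and the bit-level field encoding (2, 5 and 3 bits) is deliberately not modelled.

```python
# Illustrative sketch only: table keys and helper names are assumptions, and
# the bit-level field encoding described in the text is not modelled here.
from itertools import product

VECTOR_BITS = 256  # fixed for a given implementation

PATTERNS = {
    "all": lambda n: n,
    "pow2": lambda n: 1 << (n.bit_length() - 1),
    "mul3": lambda n: (n // 3) * 3,
}

# Precompute one entry per (element size, pattern, scaling value) combination.
RESULT_TABLE = {
    (bits, name, scale): fn(VECTOR_BITS // bits) * scale
    for bits, (name, fn), scale in product((8, 16, 32, 64), PATTERNS.items(), range(1, 9))
}


def execute(kind: str, element_bits: int, pattern: str, scale: int, xd: int = 0) -> int:
    """Look up the precomputed value, then apply the adder or subtractor."""
    value = RESULT_TABLE[(element_bits, pattern, scale)]
    if kind == "count":
        return value
    if kind == "inc":
        return xd + value
    if kind == "dec":
        return xd - value
    raise ValueError(f"unknown instruction kind: {kind}")
```

Because `VECTOR_BITS` is fixed, the table contents never change at run time, which mirrors the fixed form of the table address decoder for a particular implementation.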

Figure 5 is a flow diagram schematically illustrating the logical flow of processing a scaling vector length querying instruction. It will be appreciated that in practice the individual steps illustrated in Figure 5 may be performed in parallel. At step 26 processing waits until a scaling vector length querying instruction is received. Step 28 then determines the raw maximum number of vector elements of the specified vector element size supported by the vector register implementation. Step 30 applies the constraint pattern (if any) specified by the scaling vector length querying instruction. Step 32 scales the partial result determined by applying the constraint pattern at step 30. Step 34 then determines the instruction type. If the instruction is a count instruction, step 36 writes the count result to the destination register. If the instruction is an increment instruction, step 38 increments the value already held in the destination register by the increment value determined at step 32 and writes the incremented result to the destination register. If step 34 determines that the instruction is a decrement instruction, step 40 decrements the destination register value by the decrement value calculated at step 32 and writes the result to the destination register.
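The flow of steps 28 to 40 can be followed in a short Python sketch, with each line annotated by the step it models. The `Query` dataclass, its field names and the pattern names are hypothetical, introduced only so the example is self-contained; the sequential code also does not reflect the parallel evaluation that the text says is likely in hardware.

```python
# Illustrative sketch of steps 28 to 40 only: all names are assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    kind: str          # "count", "inc" or "dec"
    element_bits: int  # 8, 16, 32 or 64
    pattern: str       # "all", "pow2" or "mul"
    scale: int         # scaling value encoded in the instruction
    m: int = 1         # multiple used by the "mul" pattern


def run_query(q: Query, vector_bits: int, xd: int = 0) -> int:
    raw = vector_bits // q.element_bits            # step 28: raw maximum element count
    if q.pattern == "all":                         # step 30: apply the constraint pattern
        constrained = raw
    elif q.pattern == "pow2":
        constrained = 1 << (raw.bit_length() - 1)
    elif q.pattern == "mul":
        constrained = (raw // q.m) * q.m
    else:
        raise ValueError(f"unknown pattern: {q.pattern}")
    scaled = constrained * q.scale                 # step 32: scale the partial result
    if q.kind == "count":                          # step 34: select by instruction type
        return scaled                              # step 36: write the count result
    if q.kind == "inc":
        return xd + scaled                         # step 38: write the incremented value
    if q.kind == "dec":
        return xd - scaled                         # step 40: write the decremented value
    raise ValueError(f"unknown instruction kind: {q.kind}")
```

As a usage example, `run_query(Query("dec", 64, "all", 4), vector_bits=512, xd=100)` models a decrement by four vectors' worth of doublewords on a 512-bit implementation.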

Figure 6 illustrates a virtual machine implementation that may be used. Whilst the embodiments described above implement the present invention in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide so-called virtual machine implementations of hardware devices. These virtual machine implementations run on a host processor 530 running a host operating system 520 supporting a virtual machine program 510. Typically, a powerful processor is required to provide a virtual machine implementation which executes at a reasonable speed, but such an approach may be justified in certain circumstances, such as when there is a desire to run code native to another processor for compatibility or re-use reasons. The virtual machine program 510 provides an application program interface to an application program 500 which is the same as the application program interface which would be provided by the real hardware that is the device being modelled by the virtual machine program 510. Thus, the program instructions, including the control of memory accesses described above, may be executed from within the application program 500 using the virtual machine program 510 to model their interaction with the virtual machine hardware.

Although particular embodiments have been described herein, it will be appreciated that the invention is not limited to them and that many modifications and additions may be made within the scope of the invention. For example, various combinations of the features of the following dependent claims could be made with the features of the independent claims without departing from the scope of the present invention.

2‧‧‧Data processing system
4‧‧‧Processor
6‧‧‧Memory
8‧‧‧Data values
10‧‧‧Program instructions
12‧‧‧Instruction fetch unit
14‧‧‧Decoder circuitry
16‧‧‧Control signals
18‧‧‧Vector processing circuitry
20‧‧‧Vector register circuitry
22‧‧‧Table address decoder
24‧‧‧Table of result values
26‧‧‧Step
28‧‧‧Step
30‧‧‧Step
32‧‧‧Step
34‧‧‧Step
36‧‧‧Step
38‧‧‧Step
40‧‧‧Step
500‧‧‧Application program
510‧‧‧Virtual machine program
520‧‧‧Host operating system
530‧‧‧Host processor

Figure 1 schematically illustrates a data processing system supporting vector processing;

Figure 2 schematically illustrates a plurality of different forms of scaling vector length querying instruction;

Figure 3 schematically illustrates examples of the behaviour of the different types of scaling vector length querying instruction of Figure 2;

Figure 4 schematically illustrates a lookup table implementation for generating a result value in response to a scaling vector length querying instruction;

Figure 5 is a flow diagram schematically illustrating the behaviour of a scaling vector length querying instruction; and

Figure 6 schematically illustrates a virtual machine implementation.



Claims (15)

一種用於處理資料之設備,該設備包括:處理電路系統,用以執行向量處理操作;及解碼器電路系統,用以解碼程式指令來產生控制信號以控制該處理電路系統,以執行該等向量處理操作;其中該解碼器電路系統可回應於規定一向量元素大小和一比例值的一比例向量長度查詢指令以控制該處理電路系統,以返回一結果值,該結果值取決於在一預定向量長度下的該向量元素大小的一向量元素數目乘以該比例值,該預定向量長度表示由該設備使用的一向量暫存器的一長度。 A device for processing data, the device comprising: a processing circuit system for performing vector processing operations; and a decoder circuit system for decoding program instructions to generate control signals to control the processing circuit system to execute the vectors Processing operation; wherein the decoder circuit system can respond to a ratio vector length query command specifying a vector element size and a ratio value to control the processing circuit system to return a result value, the result value depends on a predetermined vector The number of vector elements of the vector element size under the length is multiplied by the ratio value, and the predetermined vector length represents a length of a vector register used by the device. 如請求項1所述之設備,其中該比例值是該比例向量長度查詢指令內編碼的一恆定整數值。 The device according to claim 1, wherein the scale value is a constant integer value encoded in the scale vector length query instruction. 如請求項1所述之設備,其中該比例值處於自1延伸至8的一範圍中,該範圍包括1及8。 The device according to claim 1, wherein the ratio value is in a range extending from 1 to 8, and the range includes 1 and 8. 如請求項1所述之設備,其中該向量元素大小選自8位元、16位元、32位元及64位元中之一者。 The device according to claim 1, wherein the size of the vector element is selected from one of 8-bit, 16-bit, 32-bit, and 64-bit. 如請求項1所述之設備,其中該比例向量長度查詢指令包括一或更多個其他參數,且由該處理電路系統返回的該結果值取決於該一或更多個其他參 數。 The device according to claim 1, wherein the proportional vector length query instruction includes one or more other parameters, and the result value returned by the processing circuit system depends on the one or more other parameters number. 
如請求項5所述之設備,其中該一或更多個其他參數包括一向量模式約束,在該向量模式約束的限制下決定由該設備使用的該向量長度。 The device according to claim 5, wherein the one or more other parameters include a vector mode constraint, and the vector length used by the device is determined under the restriction of the vector mode constraint. 如請求項6所述之設備,其中該向量模式約束規定該元素數目是以下各者中之一者:由該設備提供的一最大值,該最大值亦是一規定值M的一倍數;由該設備提供的一最大值,該最大值亦是2的一乘冪;及由該設備提供的一最大值。 The device according to claim 6, wherein the vector mode constraint stipulates that the number of elements is one of the following: a maximum value provided by the device, and the maximum value is also a multiple of a prescribed value M; A maximum value provided by the device, the maximum value is also a power of 2; and a maximum value provided by the device. 如請求項1所述之設備,其中該比例向量長度查詢指令是一比例計數指令,該比例計數指令返回一計數結果值,該結果值取決於該元素數目乘以該比例值。 The device according to claim 1, wherein the proportional vector length query command is a proportional counting command, and the proportional counting command returns a counting result value, and the result value depends on the number of elements multiplied by the proportional value. 如請求項1所述之設備,其中該比例向量長度查詢指令是一比例遞增指令,該比例遞增指令返回一遞增結果值,該遞增結果值取決於將遞增的一輸入值。 The device according to claim 1, wherein the proportional vector length query command is a proportional increase command, the proportional increase command returns an increase result value, and the increase result value depends on an input value to be increased. 如請求項1所述之設備,其中該比例向量長度查詢指令是一比例遞減指令,該比例遞減指令返回一遞減結果值,該遞減結果值取決於將遞減的一輸 入值。 The device according to claim 1, wherein the proportional vector length query command is a proportional decrement command, and the proportional decrement command returns a decrement result value, and the decrement result value depends on an input to be decremented Into the value. 
如請求項1所述之設備,其中該向量處理電路系統包括一查找表,該查找表取決於該比例值而定址,以至少部分地決定該結果值。 The device according to claim 1, wherein the vector processing circuit system includes a look-up table, the look-up table is addressed depending on the ratio value to at least partially determine the result value. 如請求項5所述之設備,其中該查找表取決於該一或更多個其他參數而定址。 The device according to claim 5, wherein the look-up table is addressed depending on the one or more other parameters. 一種用於處理資料之設備,該設備包括:處理手段,用於執行向量處理操作;及解碼手段,用以解碼程式指令來產生控制信號以控制該處理手段,以執行該向量處理操作;其中該解碼器手段可回應於規定一向量元素大小和一比例值的一比例向量長度查詢指令以控制該處理手段,以返回一結果值,該結果值取決於在一預定向量長度下的該向量元素大小的一向量元素數目乘以該比例值,該預定向量長度表示由該設備使用的一向量暫存器的一長度。 A device for processing data, the device comprising: processing means for performing vector processing operations; and decoding means for decoding program instructions to generate control signals to control the processing means to perform the vector processing operations; wherein the The decoder means can respond to a scale vector length query command specifying a vector element size and a scale value to control the processing means to return a result value that depends on the vector element size under a predetermined vector length The number of vector elements of is multiplied by the ratio, and the predetermined vector length represents a length of a vector register used by the device. 一種處理資料的方法,該方法包括以下步驟:解碼規定一向量元素大小和一比例值的一比例向量長度查詢指令,以控制處理電路系統,以返回一結果值,該結果值取決於在一預定向量長度下的該向量元素大小的一向量元素數目乘以該比例值,該預定向 量長度表示由一設備使用的一向量暫存器的一長度。 A method for processing data. The method includes the following steps: decoding a scale vector length query command specifying a vector element size and a scale value to control the processing circuit system to return a result value, which depends on a predetermined value The number of vector elements of the vector element size under the vector length is multiplied by the ratio value, and the predetermined direction The quantity length represents a length of a vector register used by a device. 
A computer program stored on a non-transitory storage medium for controlling a computer to provide a virtual machine execution environment corresponding to the device according to claim 1.
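
The count, increment, and decrement variants of the vector length querying instruction recited in the claims above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware. The function names, the `mode` strings, and the choice to express lengths in bits are hypothetical. The claims say only that the increment and decrement results depend on the input value, so adding or subtracting the scaled element count is an assumption made here for illustration.

```python
def elements_per_vector(vector_length_bits, element_size_bits, mode="max", m=1):
    """Number of elements of the given size in one vector register.

    mode models the vector mode constraint (names are hypothetical):
      "max"           - the maximum value the device provides
      "multiple_of_m" - the maximum value that is also a multiple of m
      "power_of_2"    - the maximum value that is also a power of 2
    """
    n = vector_length_bits // element_size_bits
    if mode == "max":
        return n
    if mode == "multiple_of_m":
        return n - (n % m)  # round down to a multiple of m
    if mode == "power_of_2":
        # Largest power of 2 not exceeding n (0 if no element fits).
        return 1 << (n.bit_length() - 1) if n > 0 else 0
    raise ValueError(f"unknown vector mode constraint: {mode}")


def scaled_count(vector_length_bits, element_size_bits, scale, **constraint):
    """Count result: number of elements multiplied by the scaling value."""
    return elements_per_vector(vector_length_bits, element_size_bits, **constraint) * scale


def scaled_increment(value, vector_length_bits, element_size_bits, scale, **constraint):
    """Assumed semantics: input value plus the scaled element count."""
    return value + scaled_count(vector_length_bits, element_size_bits, scale, **constraint)


def scaled_decrement(value, vector_length_bits, element_size_bits, scale, **constraint):
    """Assumed semantics: input value minus the scaled element count."""
    return value - scaled_count(vector_length_bits, element_size_bits, scale, **constraint)
```

For example, with a 384-bit vector register and 32-bit elements this model gives 12 elements unconstrained, 8 under the power-of-2 constraint, and 10 under a multiple-of-5 constraint. Because software queries the count instead of hard-coding it, the same loop can advance an index by the scaled count on devices with different vector lengths.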
TW105122825A 2015-07-31 2016-07-20 Vector length querying instruction TWI721999B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP15386026.7A EP3125109B1 (en) 2015-07-31 2015-07-31 Vector length querying instruction
EP15386026.7 2015-07-31

Publications (2)

Publication Number Publication Date
TW201717051A (en) 2017-05-16
TWI721999B (en) 2021-03-21

Family

ID=54140382

Family Applications (1)

Application Number Title Priority Date Filing Date
TW105122825A TWI721999B (en) 2015-07-31 2016-07-20 Vector length querying instruction

Country Status (8)

Country Link
US (1) US11314514B2 (en)
EP (1) EP3125109B1 (en)
JP (1) JP6818010B2 (en)
KR (1) KR102586258B1 (en)
CN (1) CN107851022B (en)
IL (1) IL256403B (en)
TW (1) TWI721999B (en)
WO (1) WO2017021055A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170046168A1 (en) * 2015-08-14 2017-02-16 Qualcomm Incorporated Scalable single-instruction-multiple-data instructions
US11455169B2 (en) 2019-05-27 2022-09-27 Texas Instruments Incorporated Look-up table read
CN110333857B (en) * 2019-07-12 2023-03-14 辽宁工程技术大学 Automatic user-defined instruction identification method based on constraint programming

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101083525A (en) * 2005-12-30 2007-12-05 英特尔公司 Cryptography processing units and multiplier
TW201020805A (en) * 2008-10-08 2010-06-01 Advanced Risc Mach Ltd Apparatus and method for performing SIMD multiply-accumulate operations
US7917302B2 (en) * 2000-09-28 2011-03-29 Torbjorn Rognes Determination of optimal local sequence alignment similarity score
US20140207838A1 (en) * 2011-12-22 2014-07-24 Klaus Danne Method, apparatus and system for execution of a vector calculation instruction
US20140289502A1 (en) * 2013-03-19 2014-09-25 Apple Inc. Enhanced vector true/false predicate-generating instructions
TW201439902A (en) * 2012-12-27 2014-10-16 Nvidia Corp Fault detection in instruction translations

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6133547A (en) * 1984-07-25 1986-02-17 Fujitsu Ltd Method for informing overflow information of vector register
US4745547A (en) * 1985-06-17 1988-05-17 International Business Machines Corp. Vector processing
US5537606A (en) * 1995-01-31 1996-07-16 International Business Machines Corporation Scalar pipeline replication for parallel vector element processing
US6788303B2 (en) * 2001-02-27 2004-09-07 3Dlabs Inc., Ltd Vector instruction set
US20040073773A1 (en) * 2002-02-06 2004-04-15 Victor Demjanenko Vector processor architecture and methods performed therein
US9170812B2 (en) * 2002-03-21 2015-10-27 Pact Xpp Technologies Ag Data processing system having integrated pipelined array data processor
US8966223B2 (en) * 2005-05-05 2015-02-24 Icera, Inc. Apparatus and method for configurable processing
CN101535945A (en) * 2006-04-25 2009-09-16 英孚威尔公司 Full text query and search systems and method of use
US8555034B2 (en) * 2009-12-15 2013-10-08 Oracle America, Inc. Execution of variable width vector processing instructions
US10175990B2 (en) * 2009-12-22 2019-01-08 Intel Corporation Gathering and scattering multiple data elements
US20110158310A1 (en) * 2009-12-30 2011-06-30 Nvidia Corporation Decoding data using lookup tables
CN101901248B (en) 2010-04-07 2012-08-15 北京星网锐捷网络技术有限公司 Method and device for creating and updating Bloom filter and searching elements
JP5699554B2 (en) * 2010-11-11 2015-04-15 富士通株式会社 Vector processing circuit, instruction issue control method, and processor system
US9092227B2 (en) * 2011-05-02 2015-07-28 Anindya SAHA Vector slot processor execution unit for high speed streaming inputs
CN104040542B (en) * 2011-12-08 2017-10-10 甲骨文国际公司 For the technology for the column vector that relational data is kept in volatile memory
US9557993B2 (en) * 2012-10-23 2017-01-31 Analog Devices Global Processor architecture and method for simplifying programming single instruction, multiple data within a register
CN103105775B (en) * 2012-12-17 2014-04-16 清华大学 Layering iterative optimization scheduling method based on order optimization and online core limitation learning machine
CN103020018B (en) * 2012-12-27 2015-09-30 南京师范大学 A kind of compressed sensing Matrix Construction Method based on multidimensional pseudo-random sequence
US9282014B2 (en) * 2013-01-23 2016-03-08 International Business Machines Corporation Server restart management via stability time
US10437600B1 (en) * 2017-05-02 2019-10-08 Ambarella, Inc. Memory hierarchy to transfer vector data for operators of a directed acyclic graph

Also Published As

Publication number Publication date
WO2017021055A1 (en) 2017-02-09
EP3125109B1 (en) 2019-02-20
EP3125109A1 (en) 2017-02-01
JP6818010B2 (en) 2021-01-20
KR102586258B1 (en) 2023-10-10
IL256403A (en) 2018-02-28
CN107851022B (en) 2022-05-17
IL256403B (en) 2019-08-29
CN107851022A (en) 2018-03-27
US11314514B2 (en) 2022-04-26
JP2018521422A (en) 2018-08-02
KR20180037961A (en) 2018-04-13
TW201717051A (en) 2017-05-16
US20180196673A1 (en) 2018-07-12
