CN117132450B - Computing device capable of realizing data sharing and graphic processor - Google Patents


Info

Publication number
CN117132450B
Authority
CN
China
Prior art keywords
data
input parameter
parameter
unit
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311376818.9A
Other languages
Chinese (zh)
Other versions
CN117132450A (en)
Inventor
冯雨
杨喜乐
格雷格·克拉克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xindong Microelectronics Technology Wuhan Co ltd
Original Assignee
Xindong Microelectronics Technology Wuhan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xindong Microelectronics Technology Wuhan Co ltd
Priority to CN202311376818.9A
Publication of CN117132450A
Application granted
Publication of CN117132450B
Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Logic Circuits (AREA)

Abstract

The invention discloses a computing device and a graphics processor capable of realizing data sharing. The computing device comprises a shared memory, a constant register, a multiplexing unit and a plurality of data processing modules; each data processing module comprises a data unit, a data transmission unit and an operation module; the multiplexing unit is used for acquiring data from the data units of the plurality of data processing modules and providing the acquired data to the respective data transmission units of the plurality of data processing modules according to a first control signal lm_mux; the data transmission unit of each data processing module is used for acquiring data from the data unit in the same data processing module, acquiring constants from the constant register, acquiring data from the multiplexing unit and acquiring data from the shared memory. Through multi-level data sharing, the invention can significantly improve operation efficiency; it is simple in structure, easy to implement, and suitable for a variety of complex operation scenarios.

Description

Computing device capable of realizing data sharing and graphic processor
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a computing device and a graphics processor capable of realizing data sharing.
Background
A graphics processor (Graphics Processing Unit, GPU) is a microprocessor dedicated to image- and graphics-related operations. It is an important component of the graphics system architecture and is the link between the computer and the display terminal. In real-time graphics and video applications, graphics processors are required to have more powerful general-purpose computing capabilities. At present, data sharing between GPU threads is difficult to achieve, and multiple move (mov) instructions must be used to accomplish the corresponding function. In general, a mov instruction needs multiple cycles to complete because of bank conflicts, so the mov instructions become the bottleneck of the whole program and seriously limit the computing capability of the graphics processor.
Disclosure of Invention
In view of the shortcomings of the prior art and the demand for improvement, the invention provides a computing device and a graphics processor capable of realizing data sharing which, through multi-level data sharing, are simple in structure, easy to implement and suitable for a variety of complex operation scenarios.
To achieve the above object, according to one aspect of the present invention, there is provided a computing device including a shared memory, a constant register, a multiplexing unit, and a plurality of data processing modules; each data processing module comprises a data unit, a data transmission unit and an operation module; the multiplexing unit is used for acquiring data from the data units of the plurality of data processing modules and providing the acquired data to the respective data transmission units of the plurality of data processing modules according to a first control signal lm_mux; the data transmission unit of each data processing module is used for acquiring data from the data unit in the same data processing module and assigning it to a first parameter s0, acquiring a constant from the constant register and assigning it to a second parameter s1, acquiring data from the multiplexing unit and assigning it to a third parameter s2, and acquiring data from the shared memory and assigning it to a fourth parameter s3; the operation module of each data processing module is used for executing a corresponding operation according to the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3.
In some embodiments, the data provided by the data units of the data processing modules are denoted us0 to us(M-1), respectively, where M is the number of the plurality of data processing modules; correspondingly, the number of the operation module of each data processing module is denoted m, m = 0, 1, …, M-1; the first control signal lm_mux = 0, 1, …, M-1; the multiplexing unit is used for performing a logical operation on the value of the first control signal and the number m of the operation module of each data processing module, and providing the data us0 to us(M-1) provided by the data units to the data transmission units of the respective data processing modules according to the result of the logical operation.
In some embodiments, for the operation module with the number m, the multiplexing unit is configured to provide the data us (Y) to the data transmission unit in the same data processing module as the operation module with the number m, where Y is a logical operation result of the value of the first control signal lm_mux and m.
In some embodiments, the operation module of each data processing module is denoted Gm, where m = 0, 1, …, M-1 and M is the number of the plurality of data processing modules; each operation module Gm comprises N operation units, denoted Un, where N is the number of operation units and n = 0, 1, …, N-1; the data transmission unit in the same data processing module as the operation module Gm comprises a data distribution unit; the data distribution unit is used for providing a first input parameter s0', a second input parameter s1', a third input parameter s2' and a fourth input parameter s3' for each operation unit according to the values of the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3 and the numbers of the operation units; each operation unit is used for executing a corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3'.
In some embodiments, for the operation unit Un, the first input parameter s0'[n] = s0[n], the second input parameter s1'[n] = s1[n], the third input parameter s2'[n] = s2[n] and the fourth input parameter s3'[n] = s3[n], where s0[n] is the (n+1)-th value of the first parameter s0, s1[n] is the (n+1)-th value of the second parameter s1, s2[n] is the (n+1)-th value of the third parameter s2, and s3[n] is the (n+1)-th value of the fourth parameter s3.
In some embodiments, the data distribution unit is configured to perform a logical operation on the value of a second control signal shuf_oper and the number of each operation unit and, according to the result of the logical operation, to take a value from the first parameter s0 and assign it to the first input parameter s0' of each operation unit, take a value from the second parameter s1 and assign it to the second input parameter s1' of each operation unit, take a value from the third parameter s2 and assign it to the third input parameter s2' of each operation unit, and take a value from the fourth parameter s3 and assign it to the fourth input parameter s3' of each operation unit.
In some embodiments, the second control signal shuf_oper includes shuf_oper0, shuf_oper1, shuf_oper2 and shuf_oper3; for the operation unit Un, the data distribution unit is configured to perform a logical operation on the value of the second control signal shuf_oper0 and the number n of the operation unit Un and, according to the result of the logical operation, to take a value from the first parameter s0 and assign it to the first input parameter s0'[n] of the operation unit Un; the data distribution unit is configured to perform a logical operation on the value of the second control signal shuf_oper1 and the number n of the operation unit Un and, according to the result of the logical operation, to take a value from the second parameter s1 and assign it to the second input parameter s1'[n] of the operation unit Un; the data distribution unit is configured to perform a logical operation on the value of the second control signal shuf_oper2 and the number n of the operation unit Un and, according to the result of the logical operation, to take a value from the third parameter s2 and assign it to the third input parameter s2'[n] of the operation unit Un; the data distribution unit is configured to perform a logical operation on the value of the second control signal shuf_oper3 and the number n of the operation unit Un and, according to the result of the logical operation, to take a value from the fourth parameter s3 and assign it to the fourth input parameter s3'[n] of the operation unit Un.
In some embodiments, for the operation unit Un, the data distribution unit is configured to assign the data s0[Z0] to the first input parameter s0'[n] of the operation unit Un, to assign the data s1[Z1] to the second input parameter s1'[n] of the operation unit Un, to assign the data s2[Z2] to the third input parameter s2'[n] of the operation unit Un, and to assign the data s3[Z3] to the fourth input parameter s3'[n] of the operation unit Un, where Z0 is the result of the logical operation on n and the second control signal shuf_oper0, Z1 is the result of the logical operation on n and the second control signal shuf_oper1, Z2 is the result of the logical operation on n and the second control signal shuf_oper2, and Z3 is the result of the logical operation on n and the second control signal shuf_oper3.
In some embodiments, the data transmission unit in the same data processing module as the operation module Gm further includes a data exchange module; the data exchange module is used for exchanging the data of two of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' of each operation unit according to preset data exchange logic when a preset data exchange condition is met; each operation unit is used for executing a corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' after the data exchange is completed.
In some embodiments, for the operation unit Un, the data exchange module is configured to determine, according to the enable signal cha_able_s0's2', whether the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit Un can be exchanged; when they can be exchanged, the data exchange module is further configured to set a determination condition according to change_s0's2' and, when the determination condition is met, to exchange the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit; the data exchange module is further configured to determine, according to the enable signal cha_able_s1's3', whether the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit can be exchanged; when they can be exchanged, the data exchange module is further configured to set a determination condition according to change_s1's3' and, when the determination condition is met, to exchange the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit.
In some embodiments, each operation unit corresponds to a thread, and for the operation unit Un numbered n in the operation module Gm numbered m, the thread number corresponding to the operation unit Un is T = n + m×N; a parameter S = ceil(log2(M×N)) is set, where ceil denotes rounding up to the nearest integer; exchange flags are set as flag1 = (T >> change_s0's2') & 1 and flag2 = (T >> change_s1's3') & 1; whether to exchange the data of the first input parameter and the third input parameter of the operation unit corresponding to the thread is determined according to the value of the exchange flag flag1; and whether to exchange the data of the second input parameter and the fourth input parameter of the operation unit corresponding to the thread is determined according to the value of the exchange flag flag2.
According to another aspect of the present invention, there is provided a graphics processor including the computing device described above.
In general, compared with the prior art, the above technical solutions conceived by the present invention have the following beneficial effects: first-level data sharing is realized by adding the multiplexing unit; second-level data sharing is realized by improving the data transmission unit and adding a logic operation module to it; third-level data sharing is realized by adding the data exchange module after the logic operation module of the data transmission unit; and the specific mode of data sharing can be controlled according to actual computing requirements, so that complex operations are completed while the number of mov instructions is greatly reduced, and operation efficiency is significantly improved. The invention is simple in structure and easy to implement, and is especially suitable for a variety of complex operation scenarios.
Drawings
FIG. 1 is a schematic diagram of a computing device capable of data sharing according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a structure of a data transmission unit to an operation module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a structure of a data transmission unit to an operation module according to another embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. As will be recognized by those of skill in the pertinent art, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present application. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.
As shown in fig. 1, the computing device capable of implementing data sharing according to an embodiment of the present invention includes four data processing modules, a shared memory, a constant register, a multiplexing unit, and a local memory. The data processing modules are respectively a data processing module 101, a data processing module 103, a data processing module 105 and a data processing module 107, and each data processing module comprises a data unit, a data transmission unit and an operation module. The data processing module 101 includes a data unit 109, a data transmission unit 111, and an operation module G0, the data processing module 103 includes a data unit 113, a data transmission unit 115, and an operation module G1, the data processing module 105 includes a data unit 117, a data transmission unit 119, and an operation module G2, and the data processing module 107 includes a data unit 121, a data transmission unit 123, and an operation module G3.
The data unit is used for providing data. Specifically, data unit 109 is used to provide data us0, data unit 113 is used to provide data us1, data unit 117 is used to provide data us2, and data unit 121 is used to provide data us3. When the computing device includes M data processing modules, the data units are used to provide data usm, where m = 0, 1, …, M-1 and M = 2^k1, k1 being a natural number, i.e., M is a power of 2. The constant register is used for providing constants; the multiplexing unit is used for acquiring the data us0 to us3 (generally, us0 to us(M-1)) of the data units and distributing them to the data processing modules according to the input first control signal lm_mux. The shared memory is used to provide data for the individual data processing modules.
The data transmission unit is used for acquiring the data of the data unit in the same data processing module and assigning it to a first parameter s0, acquiring a constant provided by the constant register and assigning it to a second parameter s1, acquiring data provided by the multiplexing unit and assigning it to a third parameter s2, and acquiring data provided by the shared memory and assigning it to a fourth parameter s3. Taking the data processing module 101 as an example, the data transmission unit 111 acquires the data us0 of the data unit 109 and assigns it to the first parameter s0, acquires the constant provided by the constant register and assigns it to the second parameter s1, acquires one of the data provided by the multiplexing unit (us0 to us3; which one is determined by the input first control signal lm_mux) and assigns it to the third parameter s2, and acquires the data provided by the shared memory and assigns it to the fourth parameter s3.
The data transmission unit is used for providing the assigned first parameter s0, second parameter s1, third parameter s2 and fourth parameter s3 for an operation module in the same data processing module, and the operation module executes corresponding operation according to the first parameter s0, second parameter s1, third parameter s2 and fourth parameter s3. Taking the data processing module 101 as an example, the data transmission unit 111 provides the assigned first parameter s0, second parameter s1, third parameter s2 and fourth parameter s3 to the operation module G0, and the operation module G0 performs corresponding operations according to the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3.
It should be appreciated that, although the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3 are the four inputs of every data transmission unit, the values of the first parameter s0 and the third parameter s2 are likely to differ between data transmission units.
The number of the operation module G0 is 0, the number of the operation module G1 is 1, the number of the operation module G2 is 2, and the number of the operation module G3 is 3. Generally, the number of the operation module Gm is m; when the computing device includes M data processing modules, m = 0, 1, …, M-1 and, correspondingly, lm_mux = 0, 1, …, M-1, that is, the number of possible values of lm_mux equals the number of operation modules. In some embodiments, the multiplexing unit performs a logical operation on the value of the first control signal lm_mux and the number of each operation module, and provides the data us0 to us3 (generally, us0 to us(M-1)) of the data units to the data transmission units according to the result of the logical operation.
In some embodiments, for the operation module Gm, the multiplexing unit is configured to provide the data us(Y) to the data transmission unit in the same data processing module as the operation module Gm, where Y is the result of the logical operation on the value of the first control signal lm_mux and m.
Taking the computing device shown in FIG. 1, which includes four data processing modules, as an example, the multiplexing unit performs an exclusive-or (XOR) logical operation on the value of the first control signal lm_mux and the number of each operation module, and the data provided to each data transmission unit are shown in Table 1 below.
Table 1
lm_mux    G0     G1     G2     G3
0         us0    us1    us2    us3
1         us1    us0    us3    us2
2         us2    us3    us0    us1
3         us3    us2    us1    us0
It can be seen that, when lm_mux is 0, as for the operation module G0, the result of XOR logical operation of the value of lm_mux and the number 0 of the operation module is 0, the multiplexing unit provides the data us0 to the data transmission unit 111, and the data transmission unit 111 assigns the value of us0 to the third parameter s2; similarly, the data transmission unit 115 assigns the value of us1 to the third parameter s2, the data transmission unit 119 assigns the value of us2 to the third parameter s2, and the data transmission unit 123 assigns the value of us3 to the third parameter s2.
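The XOR-based selection described above can be sketched in a few lines of Python. This is only an illustrative software model, assuming M = 4 and the XOR variant of the logical operation used in the example; the function name `mux_select` and the string labels are hypothetical and do not come from the patent.

```python
def mux_select(lm_mux: int, m: int) -> int:
    """Return the index Y of the data unit whose data us(Y) is routed to module m.

    Assumes the XOR variant of the logical operation, as in the example above.
    """
    return lm_mux ^ m


M = 4  # number of data processing modules (a power of 2), as in FIG. 1

# Rebuild the routing table: one row per lm_mux value, one column per G0..G3.
routing = [[f"us{mux_select(lm_mux, m)}" for m in range(M)] for lm_mux in range(M)]

for lm_mux, row in enumerate(routing):
    print(lm_mux, *row)
```

Under these assumptions, each row of `routing` is a permutation of us0 to us3, so every data transmission unit receives data from a different data unit for any given lm_mux value.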
In some embodiments, each operation module comprises N operation units, denoted Un, where n = 0, 1, …, N-1 and N = 2^k2, k2 being a natural number, i.e., N is a power of 2; the number of the operation unit Un is n. As shown in FIG. 2, the data transmission unit includes a data distribution unit, which provides the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' for each operation unit according to the values of the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3 and the number of each operation unit. Each operation unit Un performs a corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3'.
In some embodiments, for the operation unit Un, the first input parameter s0'[n] = s0[n], the second input parameter s1'[n] = s1[n], the third input parameter s2'[n] = s2[n] and the fourth input parameter s3'[n] = s3[n], where s0[n] is the (n+1)-th value of the first parameter s0, s1[n] is the (n+1)-th value of the second parameter s1, s2[n] is the (n+1)-th value of the third parameter s2, and s3[n] is the (n+1)-th value of the fourth parameter s3.
To further realize data sharing, a logical operation function is added to the data distribution unit shown in FIG. 2. Specifically, the data distribution unit performs a logical operation on the value of the second control signal shuf_oper and the number of each operation unit and, based on the result of the logical operation, takes a value from the first parameter s0 and assigns it to the first input parameter s0' of each operation unit, takes a value from the second parameter s1 and assigns it to the second input parameter s1' of each operation unit, takes a value from the third parameter s2 and assigns it to the third input parameter s2' of each operation unit, and takes a value from the fourth parameter s3 and assigns it to the fourth input parameter s3' of each operation unit.
In some embodiments, the second control signal shuf_oper includes shuf_oper0, shuf_oper1, shuf_oper2 and shuf_oper3, i.e., the number of shuf_oper signals equals the number of parameters (the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3).
In some embodiments, for the operation unit Un, the data distribution unit performs a logical operation on the value of the second control signal shuf_oper0 and the number n of the operation unit Un and, according to the result of the logical operation, takes a value from the first parameter s0 and assigns it to the first input parameter s0'[n] of the operation unit Un; the data distribution unit performs a logical operation on the value of the second control signal shuf_oper1 and the number n of the operation unit Un and, according to the result of the logical operation, takes a value from the second parameter s1 and assigns it to the second input parameter s1'[n] of the operation unit Un; the data distribution unit performs a logical operation on the value of the second control signal shuf_oper2 and the number n of the operation unit Un and, according to the result of the logical operation, takes a value from the third parameter s2 and assigns it to the third input parameter s2'[n] of the operation unit Un; the data distribution unit performs a logical operation on the value of the second control signal shuf_oper3 and the number n of the operation unit Un and, according to the result of the logical operation, takes a value from the fourth parameter s3 and assigns it to the fourth input parameter s3'[n] of the operation unit Un.
In some embodiments, for the operation unit Un, the data distribution unit is configured to assign the data s0[Z0] to the first input parameter s0'[n] of the operation unit Un, to assign the data s1[Z1] to the second input parameter s1'[n] of the operation unit Un, to assign the data s2[Z2] to the third input parameter s2'[n] of the operation unit Un, and to assign the data s3[Z3] to the fourth input parameter s3'[n] of the operation unit Un, where Z0 is the result of the logical operation on n and the second control signal shuf_oper0, Z1 is the result of the logical operation on n and the second control signal shuf_oper1, Z2 is the result of the logical operation on n and the second control signal shuf_oper2, and Z3 is the result of the logical operation on n and the second control signal shuf_oper3.
Taking as an example the case in which the data distribution unit performs an exclusive-or (XOR) logical operation on the value of the second control signal shuf_oper and the number n of the operation unit Un, the first input parameter s0'[n], the second input parameter s1'[n], the third input parameter s2'[n] and the fourth input parameter s3'[n] of the operation unit Un can be expressed as:
s0'[n] = s0[n XOR shuf_oper0],
s1'[n] = s1[n XOR shuf_oper1],
s2'[n] = s2[n XOR shuf_oper2],
s3'[n] = s3[n XOR shuf_oper3].
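The XOR shuffle expressed by these four formulas can be modelled as a simple index permutation. The following Python sketch is an illustration only: the function name `shuffle`, the sample values and the assumption that each shuf_oper value lies in the range 0 to N-1 are not taken from the patent.

```python
from typing import List, Sequence


def shuffle(param: Sequence[int], shuf_oper: int) -> List[int]:
    """Model s'[n] = s[n XOR shuf_oper] for n = 0..N-1 (N a power of 2)."""
    n_units = len(param)
    assert n_units & (n_units - 1) == 0, "N must be a power of 2"
    # Assumed range: keeps n XOR shuf_oper inside 0..N-1.
    assert 0 <= shuf_oper < n_units, "shuf_oper must index a valid unit"
    return [param[n ^ shuf_oper] for n in range(n_units)]


s0 = [10, 11, 12, 13, 14, 15, 16, 17]  # made-up values, N = 8

s0_unchanged = shuffle(s0, 0)       # shuf_oper0 = 0: s0'[n] = s0[n]
s0_swapped_pairs = shuffle(s0, 1)   # neighbouring units exchange values
s0_swapped_halves = shuffle(s0, 4)  # lower and upper halves exchange values
```

With a control value of 0 the model reduces to the direct assignment s0'[n] = s0[n] described earlier; the same function would be applied separately to s1, s2 and s3 with their own control values.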
As shown in FIG. 3, to further realize data sharing, a data exchange module is introduced after the data distribution unit. When a preset data exchange condition is met, the data exchange module exchanges the data of two of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' of each operation unit according to preset data exchange logic. Each operation unit executes a corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' after the data exchange is completed.
Specifically, the data exchange module receives the data output from the data distribution unit. For the operation unit Un, the data exchange module determines, according to the enable signal cha_able_s0's2', whether the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit Un can be exchanged; when they can be exchanged, the condition parameter change_s0's2' is valid, the data exchange module sets a determination condition according to change_s0's2', and when the determination condition is met, the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit are exchanged. Similarly, the data exchange module determines, according to the enable signal cha_able_s1's3', whether the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit can be exchanged; when they can be exchanged, the condition parameter change_s1's3' is valid, the data exchange module sets a determination condition according to change_s1's3', and when the determination condition is met, the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit are exchanged.
In some embodiments, when the enable signal cha_able_s0's2' is 1, the data exchange module determines that the data of the first input parameter s0' and the third input parameter s2' of the operation unit can be exchanged, and further sets a determination condition according to change_s0's2'; when the enable signal cha_able_s1's3' is 1, the data exchange module determines that the data of the second input parameter s1' and the fourth input parameter s3' of the operation unit can be exchanged, and further sets a determination condition according to change_s1's3'.
In some embodiments, each operation unit corresponds to one thread; a computing device that includes M operation modules, each including N operation units, therefore has M×N threads in total.
For the operation unit Un numbered n in the operation module Gm numbered m, the corresponding thread number is T = n + m×N. For example, for a computing device comprising 4 operation modules, each comprising 16 operation units, the thread number T takes a value from 0 to 63.
A parameter S = ceil(log2(M×N)) is set, where ceil denotes rounding up to the nearest integer, for example, ceil(2) = 2, ceil(2.01) = 3 and ceil(1.99) = 2; the condition parameters change_s0's2' and change_s1's3' can each take a value from 0 to S-1.
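As a worked instance of this parameter, the short Python sketch below evaluates S for the example configuration of 4 operation modules with 16 operation units each. The variable names are chosen here for illustration and are not part of the patent.

```python
import math

M, N = 4, 16                      # operation modules and operation units per module
S = math.ceil(math.log2(M * N))   # number of bits needed to encode a thread number

# Values that the condition parameters may take: 0 to S-1.
valid_change_values = list(range(S))

print(S, valid_change_values)
```

For this configuration there are 64 threads, so S evaluates to 6 and each condition parameter selects one of the six bits of the thread number.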
The exchange flags are set as flag1 = (T >> change_s0's2') & 1 and flag2 = (T >> change_s1's3') & 1. Whether to exchange the data of the first input parameter and the third input parameter of the operation unit corresponding to a thread is determined according to the value of the exchange flag flag1, and whether to exchange the data of the second input parameter and the fourth input parameter of the operation unit corresponding to the thread is determined according to the value of the exchange flag flag2.
In some embodiments, when the exchange flag1 is 1, exchanging data of the first input parameter and the third input parameter of the operation unit corresponding to the thread; and when the exchange flag2 is 1, exchanging the data of the second input parameter and the fourth input parameter of the operation unit corresponding to the thread.
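The flag computation and the conditional exchange described above can be sketched in Python as follows. This is a minimal software model under stated assumptions, not the hardware implementation: the function names and the tuple representation of the four input parameters are hypothetical, and the enable signals are passed as plain integers.

```python
def exchange_flag(thread: int, change: int) -> int:
    """Exchange flag = (T >> change) & 1, as given in the text."""
    return (thread >> change) & 1


def exchange_inputs(thread, inputs, en_s0s2, change_s0s2, en_s1s3, change_s1s3):
    """Return (s0', s1', s2', s3') after the optional exchanges for one thread.

    en_s0s2 / en_s1s3 model cha_able_s0's2' / cha_able_s1's3';
    change_s0s2 / change_s1s3 model change_s0's2' / change_s1's3'.
    """
    s0, s1, s2, s3 = inputs
    if en_s0s2 == 1 and exchange_flag(thread, change_s0s2) == 1:
        s0, s2 = s2, s0  # flag1 is 1: exchange first and third inputs
    if en_s1s3 == 1 and exchange_flag(thread, change_s1s3) == 1:
        s1, s3 = s3, s1  # flag2 is 1: exchange second and fourth inputs
    return (s0, s1, s2, s3)


# Thread 40 has bit 5 set, thread 7 does not; only the s0'/s2' pair is enabled.
swapped = exchange_inputs(40, ("a", "b", "c", "d"), 1, 5, 0, 0)
kept = exchange_inputs(7, ("a", "b", "c", "d"), 1, 5, 0, 0)
```

In this usage, the inputs of thread 40 come back with the first and third entries exchanged, while those of thread 7 are returned unchanged.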
For a computing device comprising 4 computing modules, each computing module comprising 16 computing units, in one example, the data exchange operation is performed using the method described above, resulting in the exchange flags for each thread as shown in table two below.
Table two
cha_able_s0's2' / cha_able_s1's3' | change_s0's2' / change_s1's3' | flag1 / flag2 (thread numbers 0 to 63)
1 | 5 | 0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111
1 | 4 | 0000_0000_0000_0000_1111_1111_1111_1111_0000_0000_0000_0000_1111_1111_1111_1111
1 | 3 | 0000_0000_1111_1111_0000_0000_1111_1111_0000_0000_1111_1111_0000_0000_1111_1111
1 | 2 | 0000_1111_0000_1111_0000_1111_0000_1111_0000_1111_0000_1111_0000_1111_0000_1111
1 | 1 | 0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011_0011
1 | 0 | 0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101_0101
0 | x | 0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000
It can be seen that when the enable signal cha_able_s0's2' is 0, the value of change_s0's2' is x, which is an invalid input; when the enable signal cha_able_s0's2' is 1, change_s0's2' takes a value within 0 to 5. Taking change_s0's2' =5 as an example, the value of flag1 is:
0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111;
therefore, for the thread numbered 0, whose flag1 is 0, the logical operation module does not exchange the data of the first input parameter s0' and the third input parameter s2' of the corresponding operation unit, but keeps the data generated by the logical operation module unchanged; for the thread numbered 63, whose flag1 is 1, the logical operation module exchanges the data of the first input parameter s0' and the third input parameter s2' of the corresponding operation unit.
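The per-thread flag patterns for the 64-thread example can be regenerated from the formula flag = (T >> change) & 1. The following Python sketch prints one pattern per setting, with thread 0 on the left and an underscore after every four threads; the function name flag_row and the grouping are illustrative choices, and a disabled enable signal is modeled as an all-zero row.

```python
def flag_row(enable: int, change: int = 0, num_threads: int = 64) -> str:
    """Return the exchange flags of threads 0..num_threads-1 as a grouped string."""
    bits = "".join(
        str((t >> change) & 1) if enable == 1 else "0"
        for t in range(num_threads)
    )
    # Group four threads per block, separated by underscores.
    return "_".join(bits[i:i + 4] for i in range(0, len(bits), 4))


for change in (5, 4, 3, 2, 1, 0):
    print(1, change, flag_row(1, change))
print(0, "x", flag_row(0))
```

For change = 5 the row consists of 32 zeros followed by 32 ones, which matches the flag1 value quoted above: thread 0 is not exchanged and thread 63 is.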
It should be understood that, with respect to the data transmission unit structure shown in fig. 3, the data distribution unit may or may not have a logic operation function. In some embodiments, the data distribution unit does not have a logic operation function, that is, two-stage data sharing is realized through the multiplexing unit and the data exchange module at this time; in other embodiments, the data distribution unit has a logical operation function, that is, three-level data sharing is realized through the multiplexing unit, the data distribution unit and the data exchange module.
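The difference between the two-level and three-level structures can be pictured as a chain of up to three selection stages. The Python sketch below is only an illustration under loud assumptions: this passage does not restate which logical operation the multiplexing unit and the data distribution unit apply, so bitwise XOR is assumed here purely for the example, the list lengths are assumed to be powers of two so that the XOR result stays in range, and all function names are hypothetical.

```python
def multiplex(unit_data, lm_mux):
    """Stage 1: module m receives us(Y). ASSUMPTION: Y = lm_mux XOR m."""
    return [unit_data[lm_mux ^ m] for m in range(len(unit_data))]


def distribute(param, shuf_oper):
    """Stage 2: unit n receives param[Z]. ASSUMPTION: Z = n XOR shuf_oper."""
    return [param[n ^ shuf_oper] for n in range(len(param))]


def exchange(first, second, thread, enable, change):
    """Stage 3: exchange a pair when enabled and (T >> change) & 1 is 1."""
    if enable == 1 and (thread >> change) & 1 == 1:
        return second, first
    return first, second


# Under the XOR assumption, lm_mux = 1 hands each of two modules the other's data.
print(multiplex(["us0", "us1"], 1))
# A shuffle value of 0 leaves the order unchanged, which behaves like the
# two-level case in which the distribution unit performs no logic operation.
print(distribute(["a", "b", "c", "d"], 0))
print(distribute(["a", "b", "c", "d"], 1))
print(exchange("x", "y", thread=3, enable=1, change=1))
```

If the device uses a different logical operation, only the index expressions in the first two functions would change; the staging itself follows the description above.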
The computing device capable of realizing data sharing according to the present invention is described in detail below, taking as an example the butterfly operation, the basic operation unit of the FFT, performed by a computing device that includes 2 operation modules, each including 2 operation units.
The calculation formula of the butterfly operation of the basic operation unit of the FFT is as follows:
setting q=2, the calculation formula of the butterfly operation is:
the method further comprises the following steps:
Set lm_mux to 1. s0 takes data from the data unit; s1 takes data from the constant register, and every thread fetches the same constant, 1; s2 takes data from the multiplexing unit; s3 takes data from the shared memory.
The values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' of the operation unit corresponding to each thread, as output by the data distribution unit, are shown in table three below.
Table three
Set cha_able_s0's2' = 1, cha_able_s1's3' = 1, change_s0's2' = 2, and change_s1's3' = 2. At this time, for the threads numbered 2 and 3, the data of the first input parameter s0' and the third input parameter s2' are exchanged, and the data of the second input parameter s1' and the fourth input parameter s3' are exchanged. After the data exchange, the values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' that the data exchange module outputs to each operation unit are shown in table four below.
Table four
The operation unit corresponding to each thread performs the multiply-add operation according to the values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' in table four, so as to complete the calculation of the following formula:
therefore, the two-stage data sharing structure of the embodiment of the invention can complete the operation by only one instruction.
For the following calculation formula:
the three-level data sharing structure, in which the data distribution unit has a logic operation function, is used with shuf_oper2 = 1; the values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' of the operation unit corresponding to each thread, as output by the data distribution unit, are shown in table five below.
Table five
Set cha_able_s0's2' = 1, cha_able_s1's3' = 1, change_s0's2' = 1, and change_s1's3' = 1. At this time, for the threads numbered 1 and 3, the data of the first input parameter s0' and the third input parameter s2' are exchanged, and the data of the second input parameter s1' and the fourth input parameter s3' are exchanged. After the data exchange, the values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' that the data exchange module outputs to each operation unit are shown in table six below.
Table six
The operation unit corresponding to each thread performs the multiply-add operation according to the values of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' in table six, so as to complete the calculation of the following formula:
therefore, the three-level data sharing structure of the embodiment of the invention can complete the operation by only one instruction.
The invention also provides a graphic processor comprising the computing device. By the computing device capable of realizing data sharing, the graphics processor can more efficiently complete processing of various graphics data.
The multi-level data sharing structure can greatly reduce the number of operation instructions and significantly improve operation efficiency while completing complex operations. It can also control the specific manner of data sharing according to actual calculation requirements, has a simple structure, is easy to implement, and is particularly suitable for various complex operation scenarios.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided that they do not contradict one another.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Any process or method description in a flowchart or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more (two or more) executable instructions for implementing specific logical functions or steps of the process. And the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed in a substantially simultaneous manner or in an opposite order from that shown or discussed, including in accordance with the functions that are involved.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. All or part of the steps of the methods of the embodiments described above may be performed by a program that, when executed, comprises one or a combination of the steps of the method embodiments, instructs the associated hardware to perform the method.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules described above, if implemented in the form of software functional modules and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of various changes or substitutions within the technical scope of the present application, and these should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A computing device comprising a shared memory, a constant register, a multiplexing unit, and a plurality of data processing modules; each data processing module comprises a data unit, a data transmission unit and an operation module;
the data provided by the data units of the data processing modules are denoted us0 to us(M-1), wherein M is the number of the data processing modules; correspondingly, the number of the operation module of each data processing module is denoted m, m = 0, 1, …, M-1; the first control signal lm_mux = 0, 1, …, M-1;
the multiplexing unit is used for carrying out a logic operation on the value of the first control signal lm_mux and the number m of the operation module of each data processing module, and respectively providing the data us0 to us(M-1) provided by the data units of the data processing modules to the data transmission units of the data processing modules according to the result of the logic operation; for the operation module with the number m, the multiplexing unit is used for providing the data us(Y) to the data transmission unit in the same data processing module as the operation module with the number m, wherein Y is the logical operation result of the value of the first control signal lm_mux and m;
the data transmission unit of each data processing module is used for acquiring data from the data unit in the same data processing module and giving a first parameter s0, acquiring constant from the constant register and giving a second parameter s1, acquiring data from the multiplexing unit and giving a third parameter s2, and acquiring data from the shared memory and giving a fourth parameter s3;
the operation module of each data processing module is used for executing corresponding operation according to the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3.
2. The computing device of claim 1, wherein the operation module of each data processing module is denoted Gm, where m = 0, 1, …, M-1, and M is the number of the plurality of data processing modules;
each operation module Gm comprises N operation units, denoted Un, where N is the number of the operation units, n=0, 1, …, N-1; the data transmission unit in the same data processing module as the operation module Gm comprises a data distribution unit;
the data distribution unit is used for providing a first input parameter s0', a second input parameter s1', a third input parameter s2' and a fourth input parameter s3' for each operation unit according to the values of the first parameter s0, the second parameter s1, the third parameter s2 and the fourth parameter s3 and the numbers of the operation units;
each operation unit is used for executing corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3'.
3. The computing device of claim 2, wherein the first input parameter of the operation unit Un is s0'[n] = s0[n], the second input parameter of the operation unit Un is s1'[n] = s1[n], the third input parameter of the operation unit Un is s2'[n] = s2[n], and the fourth input parameter of the operation unit Un is s3'[n] = s3[n], wherein s0[n] is the (n+1)-th value of the first parameter s0, s1[n] is the (n+1)-th value of the second parameter s1, s2[n] is the (n+1)-th value of the third parameter s2, and s3[n] is the (n+1)-th value of the fourth parameter s3.
4. The computing device according to claim 2, wherein the data distribution unit is configured to perform a logical operation on the value of the second control signal shuf_oper and the number of each operation unit, and, based on the result of the logical operation, to assign a value from the first parameter s0 to the first input parameter s0' of each operation unit, a value from the second parameter s1 to the second input parameter s1' of each operation unit, a value from the third parameter s2 to the third input parameter s2' of each operation unit, and a value from the fourth parameter s3 to the fourth input parameter s3' of each operation unit.
5. The computing device of claim 4, wherein the second control signal shuf_oper includes shuf_oper0, shuf_oper1, shuf_oper2, and shuf_oper3;
for the operation unit Un, the data distribution unit is configured to perform a logic operation on the value of the second control signal shuf_oper0 and the number n of the operation unit Un, and assign a value from the first parameter s0 to the first input parameter s0'[n] of the operation unit Un according to the result of the logic operation; the data distribution unit is configured to perform a logic operation on the value of the second control signal shuf_oper1 and the number n of the operation unit Un, and assign a value from the second parameter s1 to the second input parameter s1'[n] of the operation unit Un according to the result of the logic operation; the data distribution unit is configured to perform a logic operation on the value of the second control signal shuf_oper2 and the number n of the operation unit Un, and assign a value from the third parameter s2 to the third input parameter s2'[n] of the operation unit Un according to the result of the logic operation; the data distribution unit is configured to perform a logic operation on the value of the second control signal shuf_oper3 and the number n of the operation unit Un, and assign a value from the fourth parameter s3 to the fourth input parameter s3'[n] of the operation unit Un according to the result of the logic operation.
6. The computing device according to claim 5, wherein for the operation unit Un, the data distribution unit is configured to assign data s0[Z0] to the first input parameter s0'[n] of the operation unit Un, to assign data s1[Z1] to the second input parameter s1'[n] of the operation unit Un, to assign data s2[Z2] to the third input parameter s2'[n] of the operation unit Un, and to assign data s3[Z3] to the fourth input parameter s3'[n] of the operation unit Un, wherein Z0 is the logical operation result of n and the second control signal shuf_oper0, Z1 is the logical operation result of n and the second control signal shuf_oper1, Z2 is the logical operation result of n and the second control signal shuf_oper2, and Z3 is the logical operation result of n and the second control signal shuf_oper3.
7. The computing device of any one of claims 2 to 6, wherein the data transmission unit in the same data processing module as the computing module Gm further comprises a data exchange module;
the data exchange module is used for exchanging the data of two of the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' of each operation unit according to preset data exchange logic when preset data exchange conditions are met;
each operation unit is used for executing corresponding operation according to the first input parameter s0', the second input parameter s1', the third input parameter s2' and the fourth input parameter s3' after the data exchange is completed.
8. The computing device of claim 7, wherein for the operation unit Un, the data exchange module is configured to determine whether the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit Un can be exchanged according to the enable signal cha_able_s0's2', and when the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit can be exchanged, the data exchange module is further configured to set a determination condition according to change_s0's2', and when the determination condition is met, exchange the data of the first input parameter s0'[n] and the third input parameter s2'[n] of the operation unit; the data exchange module is further configured to determine whether the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit can be exchanged according to the enable signal cha_able_s1's3', and when the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit can be exchanged, the data exchange module is further configured to set a determination condition according to change_s1's3', and when the determination condition is met, exchange the data of the second input parameter s1'[n] and the fourth input parameter s3'[n] of the operation unit.
9. The computing device of claim 8, wherein each operation unit corresponds to a thread, and for an operation unit Un numbered n in an operation module Gm numbered m, the thread number corresponding to the operation unit Un is T = n + m×N;
setting the parameter S = ceil(log2(M×N)), wherein ceil denotes rounding up to the nearest integer;
setting the exchange flag flag1 = (T >> change_s0's2') & 1 and the exchange flag flag2 = (T >> change_s1's3') & 1; wherein change_s0's2' and change_s1's3' each take a value from 0 to S-1;
according to the value of the exchange flag1, determining whether to exchange the data of the first input parameter and the third input parameter of the operation unit corresponding to the thread; and according to the value of the exchange flag2, determining whether to exchange the data of the second input parameter and the fourth input parameter of the operation unit corresponding to the thread.
10. A graphics processor comprising the computing device of any one of claims 1 to 9.
CN202311376818.9A 2023-10-24 2023-10-24 Computing device capable of realizing data sharing and graphic processor Active CN117132450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311376818.9A CN117132450B (en) 2023-10-24 2023-10-24 Computing device capable of realizing data sharing and graphic processor

Publications (2)

Publication Number Publication Date
CN117132450A CN117132450A (en) 2023-11-28
CN117132450B true CN117132450B (en) 2024-02-20

Family

ID=88854855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311376818.9A Active CN117132450B (en) 2023-10-24 2023-10-24 Computing device capable of realizing data sharing and graphic processor

Country Status (1)

Country Link
CN (1) CN117132450B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant