WO2023042989A1 - Addition operation method taking data scale into account, associated hardware accelerator, and computing device using same - Google Patents

Info

Publication number
WO2023042989A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
convolution
input
output
scale
Prior art date
Application number
PCT/KR2022/006216
Other languages
English (en)
Korean (ko)
Inventor
정태영
Original Assignee
오픈엣지테크놀로지 주식회사
Priority date
Filing date
Publication date
Application filed by 오픈엣지테크놀로지 주식회사
Publication of WO2023042989A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/22 Microcontrol or microprogram arrangements
    • G06F9/28 Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • The present invention relates to a technique for performing an operation in a computing device, and more particularly, to an addition operation technique that takes the scale of a number into account.
  • Signal processing technology used to implement artificial intelligence such as neural networks may be implemented as software or as a hardware accelerator for fast processing.
  • In a neural network used for machine learning, there are many layers that perform various calculations, and a large amount of data may be computed in each layer.
  • A problem may arise because the size of the internal memory or internal buffer provided inside the hardware accelerator is limited. That is, if one set of data to be operated on is larger than the internal memory or internal buffer, the set must be divided into two subsets, each subset must be computed separately to obtain sub-result values, and the sub-result values must then be combined.
  • In this case, the sub-result values are stored in the internal memory or internal buffer and then read back, and unwanted quantization errors may be introduced into the data in the process.
  • This quantization error would not occur if the set of data were not divided into two subsets and computed separately.
  • Prior art related to quantization of data in neural network technology includes Korean Patent Application Nos. 1020217011986, 1020200110330, 1020170150707, 1020200082108, and 1020207038081.
  • The contents described with reference to FIGS. 1 to 5 are prior knowledge known to the inventor of the present invention, and at least some of them may not have been disclosed to unspecified persons at the time of filing the present patent application.
  • The present invention uses the concept of the scale of numbers or data used by a computing device.
  • Computing devices express numbers as N-bit binary numbers.
  • The N-bit number includes a most significant bit (MSB) and a least significant bit (LSB).
  • The scale of the N-bit number may be defined as the magnitude of the number represented by the LSB of the N-bit number. It may also be defined as the smallest nonzero value that can be represented by the N-bit number.
  • For example, consider two decimal numbers, '128' and '1', each represented by 2 bits.
  • The decimal number '128' may be expressed as '01' in binary notation, and the decimal number '1' may also be expressed as '01' in binary notation.
  • The two bit patterns are identical, but the first scale, which is the scale of the 2-bit number representing the decimal number '128', differs from the second scale, which is the scale of the 2-bit number representing the decimal number '1'.
  • In this case, the first scale is 128 times larger than the second scale.
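The relationship between a bit pattern and its scale can be sketched as follows. This is only an illustration, assuming the bits are read as an unsigned integer that is then multiplied by the scale; the function name `decode` is hypothetical and does not appear in the patent.

```python
def decode(bits: str, scale: int) -> int:
    """Return the value represented by an unsigned bit pattern at a given scale.

    Assumption: value = (unsigned integer encoded by the bits) * scale,
    so the scale is the value contributed by the LSB.
    """
    return int(bits, 2) * scale


# The same 2-bit pattern '01' represents different numbers at different scales.
first = decode("01", scale=128)   # decimal 128, first scale = 128
second = decode("01", scale=1)    # decimal 1, second scale = 1

print(first, second)
```

With this reading, the smallest nonzero value representable at scale 128 is 128 itself, which matches the definition of scale as the value of the LSB.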
  • FIG. 1A illustrates a configuration of an input activation 710, which is one of objects for mathematical operations according to the present invention.
  • input activation may also be referred to as first input data.
  • the input activation 710 may be a three-dimensional array consisting of a first dimension, a second dimension, and a third dimension.
  • the first dimension, the second dimension, and the third dimension of the input activation 710 may be referred to as an input channel dimension, a height dimension, and a width dimension, respectively.
  • the input activation 710 shown in FIG. 1A is an example in which the size ci of the first dimension, the size h of the second dimension, and the size w of the third dimension are 3, 2, and 4, respectively.
  • the data size of the input activation 710 shown in FIG. 1A is proportional to ci*h*w.
  • sc_ai1 may be the decimal number 1 or the decimal number 128.
  • FIG. 1B shows the configuration of a weight 740, which is another one of the objects of mathematical operation according to the present invention.
  • the weight 740 may also be referred to as second input data.
  • the weight 740 may be a 4-dimensional array consisting of a first dimension, a second dimension, a third dimension, and a fourth dimension.
  • the first dimension, the second dimension, the third dimension, and the fourth dimension of the weight 740 may be referred to as an output channel dimension, an input channel dimension, a height dimension, and a width dimension, respectively.
  • The weight 740 shown in FIG. 1B is an example in which the size co of the first dimension, the size ci of the second dimension, the size r of the third dimension, and the size s of the fourth dimension are 2, 3, 2, and 2, respectively.
  • the data size of the weight 740 presented in FIG. 1B is proportional to co*ci*r*s.
  • the first scale sc_w1 and the second scale sc_w2 are values that can be set independently of each other.
  • the first scale sc_w1 may be proportional to the decimal number 1
  • the second scale sc_w2 may be proportional to the decimal number 128.
  • FIG. 1C shows an example in which the input activation 710 includes six input channels 711 to 716.
  • FIG. 1D shows an example in which the weight 740 is composed of two output channels 741 and 742, and each output channel is composed of six input channels (e.g., 7411 to 7416).
  • FIGS. 2A to 2C are conceptual diagrams illustrating a convolution operation between the input activation 710 and the weight 740.
  • In these figures, the circled 'x' symbol represents a convolution operation between a first object of mathematical operation placed to the left of the symbol and a second object of mathematical operation placed to the right of the symbol.
  • an output activation 750 may be generated by performing a convolution operation on the input activation 710 and the weight 740 .
  • output activation may also be referred to as output data.
  • the output activation 750 may be a three-dimensional array consisting of a first dimension, a second dimension, and a third dimension.
  • the first dimension, the second dimension, and the third dimension of the output activation 750 may be referred to as an output channel dimension, a height dimension, and a width dimension, respectively.
  • the output activation 750 shown in FIGS. 2A to 2C is an example in which the size co of the first dimension, the size ho of the second dimension, and the size wo of the third dimension are 2, 2, and 3, respectively.
  • the data size of the output activation 750 shown in FIGS. 2A to 2C is proportional to co*ho*wo.
  • FIG. 3A shows the main structure of part of a computing device used in an embodiment of the present invention.
  • The computing device 1 includes a dynamic random access memory (DRAM) 130, a hardware accelerator 110, a bus 700 connecting the DRAM 130 and the hardware accelerator 110, and other devices connected to the bus 700.
  • The DRAM 130 may be referred to as a memory 130.
  • the computing device 1 may further include a power supply unit, a communication unit, a user interface, a storage unit 170, and peripheral units not shown.
  • the bus 700 may be shared by the hardware accelerator 110, other hardware 99, and the main processor 160.
  • The hardware accelerator 110 may include a DMA (direct memory access) unit 20, a control unit 40, an internal memory 30, an input buffer 650, a data operation unit 610, and an output buffer 640.
  • Some or all of data temporarily stored in the internal memory 30 may be provided from the DRAM 130 through the bus 700 .
  • The control unit 40 and the DMA unit 20 may control the internal memory 30 and the DRAM 130 to move data stored in the DRAM 130 to the internal memory 30.
  • Data stored in the internal memory 30 may be provided to the data operation unit 610 through the input buffer 650.
  • Output values generated by the operation of the data operation unit 610 may be stored in the internal memory 30 via the output buffer 640.
  • The output values stored in the internal memory 30 may be written to the DRAM 130 under the control of the control unit 40 and the DMA unit 20.
  • The control unit 40 may collectively control the operations of the DMA unit 20, the internal memory 30, and the data operation unit 610.
  • The data operation unit 610 may perform a first calculation function during a first time period and a second calculation function during a second time period.
  • In FIG. 3A, one data operation unit 610 is shown within the hardware accelerator 110.
  • However, a plurality of the data operation units 610 shown in FIG. 3A may be provided in the hardware accelerator 110, each performing in parallel the operations requested by the control unit 40.
  • the data operation unit 610 may sequentially output the output data according to a given order according to time rather than outputting them all at once.
  • the buffer may be part of the internal memory 30 shown in FIG. 3A.
  • a first storage space allocated for the input activation 710 may be defined, and a second storage space allocated for a weight may be defined.
  • the sizes of the first storage space and the second storage space may be limited.
  • The input activation 710 may be split for each input channel, and only some of the resulting data, for example the input channels 711 and 712, may be stored in the first storage space and used.
  • Likewise, the weight 740 may be split for each input channel, so that, for example, only the input channels 7411 and 7412 of the first output channel and the input channels 7421 and 7422 of the second output channel are stored in the second storage space and used.
  • FIG. 5 illustrates a method of calculating the output activation 750 shown in FIG. 2(b) using the split data.
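Computing with split data relies on the fact that the convolution for one output channel is the element-by-element sum of per-input-channel partial results. The sketch below illustrates this with small integer data. It assumes stride 1 and no padding, so its output size need not match the sizes drawn in the figures, and the helper names `conv2d_valid`, `add_elementwise`, and `conv_output_channel` are hypothetical.

```python
def conv2d_valid(x, k):
    """2-D sliding-window multiply-accumulate of one channel x with one kernel k.

    Assumption: stride 1, no padding ('valid' output size).
    """
    h, w = len(x), len(x[0])
    r, s = len(k), len(k[0])
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(r) for b in range(s))
             for j in range(w - s + 1)]
            for i in range(h - r + 1)]


def add_elementwise(a, b):
    """Element-by-element addition of two equally sized 2-D arrays."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def conv_output_channel(activation, weight_co):
    """Convolve each input channel separately, then sum the partial results."""
    partials = [conv2d_valid(x, k) for x, k in zip(activation, weight_co)]
    out = partials[0]
    for p in partials[1:]:
        out = add_elementwise(out, p)
    return out, partials


# Three input channels of size 2x4 and one output channel with three 2x2 kernels.
activation = [
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[1, 0, 1, 0], [0, 1, 0, 1]],
    [[2, 2, 2, 2], [1, 1, 1, 1]],
]
weight_co = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
    [[0, 1], [1, 0]],
]

out, partials = conv_output_channel(activation, weight_co)
print(partials)  # [[[7, 9, 11]], [[2, 2, 2]], [[3, 3, 3]]]
print(out)       # [[12, 14, 16]]
```

Because only one input channel of the activation and one matching kernel are needed at a time, each partial result can be computed with a small buffer. The cost is that each partial result has to be stored and read back before the final addition.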
  • An object of the present invention is to provide a technique for reducing quantization errors generated in the process of processing or dividing data into two or more groups when calculating or processing data in a hardware accelerator.
  • An operation method provided according to an aspect of the present invention relates to a specific method of element-by-element addition operation (P101, P102) to improve the above-described quantization error.
  • The operation method may include: a step in which a computing device performs a convolution on first input data 710 and second input data 740 for each input channel to generate a set of convolution data (7012, 7034, 7056, or 7511 to 7513);
  • a step in which the computing device determines the scales (sc_co_ci1,2, sc_co_ci3,4, and sc_co_ci5,6) representing each convolution data, based on statistical values of the values constituting each convolution data of the set of convolution data;
  • a step in which the computing device performs an addition operation on first convolution data 7012 expressed in a first scale and second convolution data 7034 expressed in a second scale among the set of convolution data, to generate intermediate data 750p; and a step in which the computing device, after generating the intermediate data, performs an addition operation on third convolution data 7056 expressed in a third scale among the set of convolution data and the intermediate data, to calculate the output data 750.
  • the third scale may not be smaller than the first scale, and the third scale may not be smaller than the second scale.
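The effect of adding the largest-scale operand last can be illustrated with a small integer example. This is a sketch under stated assumptions, not the patent's implementation: it assumes that the result of each addition is stored at the larger of the two operand scales and rounded to the nearest multiple of that scale, and the names `requantize`, `add_scaled`, and `accumulate` are hypothetical.

```python
def requantize(value: int, scale: int) -> int:
    """Round a non-negative integer value to the nearest multiple of scale."""
    return ((value + scale // 2) // scale) * scale


def add_scaled(a, b):
    """Add two (value, scale) operands.

    Assumption: the sum is stored at the larger of the two scales,
    so it is rounded to a multiple of that scale.
    """
    (value_a, scale_a), (value_b, scale_b) = a, b
    scale = max(scale_a, scale_b)
    return requantize(value_a + value_b, scale), scale


def accumulate(operands):
    """Add the operands one by one in the given order."""
    acc = operands[0]
    for operand in operands[1:]:
        acc = add_scaled(acc, operand)
    return acc


small_a = (3, 1)    # value 3 expressed at scale 1
small_b = (3, 1)    # value 3 expressed at scale 1
large_c = (16, 8)   # value 16 expressed at scale 8
exact = 3 + 3 + 16  # 22

# Largest scale added last: the small values accumulate at scale 1 first.
ascending = accumulate([small_a, small_b, large_c])    # (24, 8), error 2
# Largest scale added first: each small value is rounded away at scale 8.
large_first = accumulate([large_c, small_a, small_b])  # (16, 8), error 6

print(ascending, large_first, exact)
```

In this example, adding the two small-scale operands first lets their sum survive the final rounding, whereas adding them one at a time to the large-scale operand loses both.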
  • The step of generating the set of convolution data may include generating the set of convolution data by convolving, for each input channel, a set of first split data (711 to 716) obtained by splitting the first input data for each input channel with a set of second split data (7411 to 7416) obtained by splitting one set of output channels (741) of the second input data for each input channel.
  • The step of calculating the output data may include performing an addition operation on the third convolution data 7056 and the intermediate data to calculate output data corresponding to the set of output channels of the second input data.
  • The set of output channels may be any one specific output channel among a plurality of output channels constituting the second input data, and the output data corresponding to the set of output channels of the second input data may be output data corresponding to the specific output channel.
  • The step of performing convolution for each input channel to generate a set of convolution data may include: convolving the set of first split data and the set of second split data for each input channel to generate a set of input channel convolution data (7501 to 7506, or 7511 to 7516) consisting of the input channel convolution data corresponding to each input channel; and generating the set of convolution data by grouping the set of input channel convolution data.
  • Each convolution data may be identical to one input channel convolution data among the set of input channel convolution data, or may be calculated by performing an element-by-element addition operation on two or more input channel convolution data among the set of input channel convolution data.
  • To determine the groups, the calculation method may include: a step in which the computing device calculates the range of the values of the elements constituting each of the second split data to determine a set of ranges (rg_w_co_ci1 to rg_w_co_ci6); and a step in which the computing device groups the set of input channel convolution data based on the set of ranges.
  • Alternatively, to determine the groups, the calculation method may include: a step in which the computing device calculates the range of the values of the elements constituting each of the input channel convolution data to determine a set of ranges (rg_co_ci1 to rg_co_ci6); and a step in which the computing device groups the set of input channel convolution data based on the set of ranges.
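One possible way to group input channels by range is sketched below. The text above does not fix a particular grouping rule, so two assumptions are made here purely for illustration: the range of a channel is taken as its maximum element minus its minimum element, and channels are sorted by range and cut into groups of equal size. The names `value_range` and `group_by_range` are hypothetical.

```python
def value_range(channel):
    """Range of a 2-D channel, assumed here to be max element minus min element."""
    flat = [v for row in channel for v in row]
    return max(flat) - min(flat)


def group_by_range(channels, group_size):
    """Group channel indices so that channels with similar ranges end up together.

    Assumption: sort by range in ascending order, then cut into consecutive groups.
    """
    ranges = [value_range(c) for c in channels]
    order = sorted(range(len(channels)), key=lambda i: ranges[i])
    groups = [order[i:i + group_size] for i in range(0, len(order), group_size)]
    return groups, ranges


# Six illustrative 2x2 channels (indices 0 to 5) with different value ranges.
channels = [
    [[0, 1], [0, 1]],       # range 1
    [[-50, 50], [0, 0]],    # range 100
    [[1, 3], [2, 2]],       # range 2
    [[0, 90], [10, 20]],    # range 90
    [[-5, 5], [0, 0]],      # range 10
    [[0, 12], [6, 6]],      # range 12
]

groups, ranges = group_by_range(channels, group_size=2)
print(ranges)  # [1, 100, 2, 90, 10, 12]
print(groups)  # [[0, 2], [4, 5], [3, 1]]
```

Grouping channels with similar ranges means that the data added together inside one group can share a similar scale, and the group with the widest range can be added last.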
  • the step of generating the one set of convolution data may include convolution of the first split data of the one set and the second split data of the one set for each input channel to obtain an input channel convolution corresponding to each input channel. and generating a set of input channel convolution data consisting of data, wherein each convolution data is identical to one input channel convolution data among the set of input channel convolution data.
  • The computing device may perform the step of generating the set of convolution data, the step of determining, the step of generating the intermediate data, and the step of calculating the output data for every output channel included in the second input data.
  • the computing device may be configured to combine output data for each channel generated for each output channel included in the second input data to generate output data including all the output channels.
  • The first input data may be an input activation, the second input data may be a weight, and the output data may be an output activation.
  • The number of dimensions of the weight may be greater than the number of dimensions of the input activation.
  • The input activation may include a plurality of first input channel data, each of which is a two-dimensional array. The weight may include a plurality of output channel data, each of the output channel data may include a plurality of second input channel data, and each of the second input channel data may be a two-dimensional array.
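  • A computing device having a hardware accelerator 110 may be provided according to one aspect of the present invention. The hardware accelerator may be designed to: obtain first input data and second input data; generate a set of convolution data by convolving the first input data and the second input data for each input channel; determine a scale representing each convolution data based on a statistical value of the values constituting each convolution data of the set of convolution data; perform an addition operation on first convolution data expressed in a first scale and second convolution data expressed in a second scale among the set of convolution data to generate intermediate data; and, after generating the intermediate data, perform an addition operation on third convolution data expressed in a third scale among the set of convolution data and the intermediate data to produce output data.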
  • the third scale may not be smaller than the first scale
  • the third scale may not be smaller than the second scale.
  • The generating of the set of convolution data may include generating the set of convolution data by convolving, for each input channel, a set of first split data obtained by splitting the first input data for each input channel with a set of second split data obtained by splitting one set of output channels of the second input data for each input channel.
  • The calculating of the output data may include calculating output data corresponding to the set of output channels of the second input data by performing an addition operation on the third convolution data and the intermediate data.
  • The hardware accelerator 110 may include an internal memory 30. The size of the internal memory may be smaller than the data size of the entire second input data, and may be greater than the size of the split data obtained by splitting one set of output channels of the second input data for each input channel.
  • According to the present invention, it is possible to provide a technique for reducing the quantization errors that occur when data is divided into two or more groups and processed during calculation or processing in a hardware accelerator.
  • FIG. 1A shows the configuration of an input activation, which is one of the objects of mathematical operation according to the present invention.
  • FIG. 1B shows the configuration of a weight, which is another one of the objects of mathematical operation according to the present invention.
  • FIG. 1C shows an example in which the input activation consists of six input channels.
  • FIG. 1D shows an example in which the weight is composed of two output channels and each output channel is composed of six input channels.
  • FIGS. 2A to 2C are conceptual diagrams illustrating a convolution operation between the input activation and the weight.
  • FIG. 3A shows the main structure of part of a computing device used in an embodiment of the present invention.
  • FIGS. 3B to 3E compare the size of the storage space for storing the objects of mathematical operation for a convolution operation with the size of those objects.
  • FIG. 5 illustrates a method of calculating the output activation shown in FIG. 2(b) using the split data.
  • FIGS. 6A to 6C are flowcharts illustrating a method of calculating output activations by performing a convolution operation on input activations and weights according to an embodiment of the present invention.
  • FIGS. 7A to 7C illustrate a method of calculating an output activation by performing a convolution operation on an input activation and a weight according to an embodiment of the present invention.
  • FIG. 8 is a flowchart illustrating a method of generating output data by performing an operation on two input data according to an embodiment of the present invention.
  • FIG. 9A illustrates a convolution operation process between an input activation composed of six input channels and a first output channel of a weight composed of two output channels.
  • FIG. 9B illustrates a convolution operation process between the input activation and the second output channel.
  • FIG. 10 illustrates an embodiment of a specific method for determining the input channels belonging to a specific group shown in FIG. 9A.
  • FIG. 11 shows another embodiment of a specific method for determining the input channels belonging to a specific group shown in FIG. 9A.
  • FIG. 13 illustrates an embodiment of a specific method for determining the input channels belonging to a specific group shown in FIG. 12.
  • FIG. 14 shows another embodiment of a specific method for determining the input channels belonging to a specific group shown in FIG. 12.
  • FIG. 15 is a flowchart illustrating a calculation method provided according to an embodiment of the present invention.
  • FIGS. 6A, 6B, and 6C may collectively be referred to as FIG. 6.
  • FIGS. 7A, 7B, and 7C may collectively be referred to as FIG. 7.
  • FIG. 6 is a flowchart illustrating a method of calculating output activations by performing a convolution operation on input activations and weights according to an embodiment of the present invention.
  • The input activation may be 3D data having the same structure as the input activation 710 illustrated in FIG. 1A.
  • The weight may be 4D data having the same structure as the weight 740 illustrated in FIG. 1B.
  • FIG. 7 illustrates a method of calculating output activations by performing a convolution operation on input activations and weights according to an embodiment of the present invention.
  • FIGS. 6A, 6B, and 6C correspond to those presented in FIGS. 7A, 7B, and 7C, respectively.
  • In step S110, a set of first split data 711, 712, and 713 may be obtained by splitting the input activation 710 for each input channel.
  • In addition, a set of second split data 7411, 7412, and 7413 may be obtained by splitting the first output channel 741 of the weight 740 for each input channel.
  • The set of first split data 711, 712, and 713 and the set of second split data 7411, 7412, and 7413 may then be convolved for each input channel to generate a set of convolution data 7511, 7512, and 7513.
  • In step S120, a scale representing each convolution data may be determined based on statistical values of the values constituting each convolution data of the set of convolution data 7511, 7512, and 7513.
  • the scale sc_co1_ci1 to be applied for the representation of the convolution data 7511 can be determined based on the distribution of values of 6 elements constituting the convolution data 7511 .
  • the scale sc_co1_ci2 to be applied for the representation of the convolutional data 7512 can be determined based on the distribution of the six elements constituting the convolutional data 7512 .
  • the scales applied to the convolution data 7511, the convolution data 7512, and the convolution data 7513 may be determined as sc_co1_ci1, sc_co1_ci2, and sc_co1_ci3, respectively.
  • sc_co1_ci1, sc_co1_ci2, and sc_co1_ci3 are values that can be independently determined. Accordingly, sc_co1_ci1, sc_co1_ci2, and sc_co1_ci3 may be the same or different.
  • In step S130, the specific expression values of the set of convolution data 7511, 7512, and 7513 may be determined according to the determined scales sc_co1_ci1, sc_co1_ci2, and sc_co1_ci3.
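Determining a scale from the values of one convolution data and then expressing those values at that scale can be sketched as follows. The text only says that the scale is based on statistical values or the distribution of the elements, so the rule used here is an assumption: a signed 8-bit representation whose scale is the largest absolute value divided by 127. The names `choose_scale` and `express`, and the six sample values, are hypothetical.

```python
def choose_scale(values, n_bits=8):
    """Pick a scale so that the largest magnitude maps to the largest signed code.

    Assumption: scale = max(|v|) / (2**(n_bits - 1) - 1); 1.0 if all values are zero.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 1.0
    return max_abs / (2 ** (n_bits - 1) - 1)


def express(values, scale):
    """Express each value as an integer code such that value ~= code * scale."""
    return [round(v / scale) for v in values]


# Six illustrative elements standing in for one convolution data.
conv_data = [12.5, -31.75, 7.5, 25.0, -5.0, 22.5]

sc = choose_scale(conv_data)      # 0.25
codes = express(conv_data, sc)    # [50, -127, 30, 100, -20, 90]

print(sc, codes)
```

Because each convolution data gets its own scale under a rule like this, two convolution data with very different value distributions end up with different scales, which is the situation the ordered addition in the following steps addresses.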
  • In step S140, intermediate data 751p may be generated by performing an addition operation on the first convolution data 7511 expressed in the 'first scale (sc_co1_ci1)' and the second convolution data 7512 expressed in the 'second scale (sc_co1_ci2)' among the set of convolution data 7511, 7512, and 7513.
  • In step S150, an addition operation may be performed on the third convolution data 7513 expressed in the 'third scale (sc_co1_ci3)' among the set of convolution data 7511, 7512, and 7513 and the intermediate data 751p to calculate the output data for the first output channel.
  • FIG. 7A shows an example in which the third scale (sc_co1_ci3) is not smaller than the first scale (sc_co1_ci1) and the third scale (sc_co1_ci3) is not smaller than the second scale (sc_co1_ci2).
  • Step S140 may be performed before step S150.
  • In step S210, a set of first split data 711, 712, and 713 obtained by splitting the input activation 710 for each input channel and a set of second split data 7421, 7422, and 7423 obtained by splitting the second output channel 742 of the weight 740 for each input channel may be convolved for each input channel to generate a set of convolution data 7521, 7522, and 7523.
  • In step S220, a scale representing each convolution data may be determined based on statistical values of the values constituting each convolution data of the set of convolution data 7521, 7522, and 7523.
  • the scale sc_co2_ci1 to be applied for the representation of the convolution data 7521 can be determined based on the distribution of 6 elements constituting the convolution data 7521 .
  • the scale sc_co2_ci2 to be applied for the representation of the convolutional data 7522 can be determined based on the distribution of the six elements constituting the convolutional data 7522 .
  • the scales applied to the convolution data 7521, the convolution data 7522, and the convolution data 7523 may be determined as sc_co2_ci1, sc_co2_ci2, and sc_co2_ci3, respectively.
  • sc_co2_ci1, sc_co2_ci2, and sc_co2_ci3 are values that can be independently determined. Accordingly, sc_co2_ci1, sc_co2_ci2, and sc_co2_ci3 may be the same or different.
  • In step S230, the specific expression values of the set of convolution data 7521, 7522, and 7523 may be determined according to the determined scales sc_co2_ci1, sc_co2_ci2, and sc_co2_ci3.
  • In step S240, intermediate data 752p may be generated by performing an addition operation on the first convolution data 7521 expressed in the 'first scale (sc_co2_ci1)' and the second convolution data 7522 expressed in the 'second scale (sc_co2_ci2)' among the set of convolution data 7521, 7522, and 7523.
  • In step S250, an addition operation may be performed on the third convolution data 7523 expressed in the 'third scale (sc_co2_ci3)' among the set of convolution data 7521, 7522, and 7523 and the intermediate data 752p to calculate the output data for the second output channel.
  • FIG. 7B shows an example in which the third scale (sc_co2_ci3) is not smaller than the first scale (sc_co2_ci1) and the third scale (sc_co2_ci3) is not smaller than the second scale (sc_co2_ci2).
  • Step S240 may be performed before step S250.
  • the first process provided according to an embodiment of the present invention may include steps S110, S120, S130, S140, and S150.
  • the second process provided according to an embodiment of the present invention may include steps S210, S220, S230, S240, and S250.
  • the third process provided according to an embodiment of the present invention may include the step S310.
  • the first process and the second process may be performed in parallel or sequentially with a precedence relationship.
  • the third process may be executed after both the first process and the second process are completed.
  • the above-described first process, second process, and third process may be executed by the main processing unit of the computing device.
  • the computing device reads command codes for execution of the first process, the second process, and the third process from storage and stores them in a volatile memory, and the main processing unit executes the command codes to execute the first process , the second process, and the third process can be executed.
  • the above-described buffer may be provided in a part of internal memory or volatile memory inside the main processing unit according to the command code.
  • the input activation 710 and the weight 740 may be stored in an internal memory of the main processing unit or a part of a volatile memory.
  • the above-described first process, second process, and third process may be executed by a dedicated hardware accelerator included in the computing device.
  • The computing device reads the instruction codes for execution of the first process, the second process, and the third process from storage and stores them in a volatile memory, and the main processing unit executes the instruction codes, so that the hardware accelerator may acquire the input activation 710 and the weight 740 from volatile memory or non-volatile memory.
  • the buffer may exist inside the hardware accelerator.
  • FIG. 8 is a flowchart illustrating a method of generating output data by performing an operation on two input data according to an embodiment of the present invention.
  • the method may include step S100 and step S200.
  • In step S100, the computing device may perform a predefined calculation process P10 for each output channel of the second input data having M output channels, to generate per-channel output data for each output channel.
  • In step S200, the computing device may generate output data by combining the M pieces of per-channel output data.
  • the calculation process (P10) may include step (S10), step (S20), step (S30), step (S40), and step (S50).
  • In step S10, the computing device may generate a set of convolution data by convolving, for each input channel, a set of first split data obtained by splitting the first input data for each input channel and a set of second split data obtained by splitting a specific output channel of the second input data for each input channel.
  • In step S20, the computing device may determine a scale representing each convolution data based on a statistical value of the values constituting each convolution data of the set of convolution data.
  • In step S30, the computing device may determine an expression value of each convolution data according to the determined scale.
  • In step S40, the computing device may generate intermediate data by performing an addition operation on first convolution data expressed in a first scale and second convolution data expressed in a second scale among the set of convolution data.
  • In step S50, the computing device may calculate output data for the specific output channel, corresponding to the specific output channel of the weight, by performing an addition operation on the intermediate data and third convolution data expressed in a third scale among the set of convolution data.
  • the third scale is not smaller than the first scale, and the third scale is not smaller than the second scale.
  • Step S40 necessarily precedes step S50.
  • the first input data may be, for example, the input activation 710 described in FIG. 7 .
  • the second input data may be, for example, the weight 740 described in FIG. 7 .
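  • The flow of FIG. 8 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names (`conv2d_valid`, `scale_of`, `process_p10`, `run`) are hypothetical, the statistic used for the scale is assumed to be the largest absolute value, the convolution is assumed to be a stride-1 operation without padding, and step S30 (choosing expression values) is omitted for brevity.

```python
def conv2d_valid(x, k):
    """2-D cross-correlation of x with kernel k (stride 1, no padding)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]


def scale_of(data):
    """Assumed statistic for step S20: largest absolute value."""
    return max(abs(v) for row in data for v in row)


def add_elementwise(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def process_p10(input_channels, weight_channel):
    """Calculation process P10 for one output channel of the second input data."""
    # S10: convolve the split data input channel by input channel.
    conv = [conv2d_valid(x, k) for x, k in zip(input_channels, weight_channel)]
    # S20: one scale per convolution data; sorting puts the largest scale last.
    ordered = sorted(conv, key=scale_of)
    # S40/S50: add smaller-scale data first, the largest-scale data last.
    acc = ordered[0]
    for data in ordered[1:]:
        acc = add_elementwise(acc, data)
    return acc


def run(input_channels, weight):
    """S100: apply P10 to each of the M output channels; S200: combine them."""
    return [process_p10(input_channels, wc) for wc in weight]


# Three 2x2 input channels and a weight with two output channels of 1x1 kernels.
activation = [[[1, 2], [3, 4]], [[10, 20], [30, 40]], [[100, 200], [300, 400]]]
weight = [[[[1]], [[1]], [[1]]], [[[2]], [[0]], [[1]]]]
print(run(activation, weight))
```

  • Sorting by scale is only one way to guarantee that the third scale is not smaller than the first and second scales; any ordering that adds the largest-scale data last satisfies the stated condition.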
  • FIG. 7A shows an example in which one partial output activation is determined by one input channel.
  • FIGS. 9A and 9B show an example in which one partial output activation is determined by a plurality of input channels, that is, one input channel group.
  • The approach of FIGS. 9A and 9B is useful when the number of input channels is large.
  • FIG. 9A illustrates a convolution operation process between an input activation 710 composed of six input channels and a first output channel of a weight composed of two output channels.
  • the example shown in FIG. 9A is different from the example shown in FIG. 7A in that the number of input channels constituting the input activation 710 is six. In FIG. 7A, the number of input channels is three.
  • the input channels constituting the input activation 710 are grouped.
  • Each group may consist of one or more input channels.
  • Steps S110, S120, S130, S140, and S150 shown in FIG. 9A are the same as steps S110, S120, S130, S140, and S150 shown in FIG. 7A, respectively.
  • In step S110, a set of first split data 711 to 716 obtained by splitting the input activation 710 for each input channel may be obtained.
  • a set of second split data 7411 to 7416 may be obtained by splitting the first output channel 741 of the weight 740 for each input channel. Then, the set of first split data 711 to 716 and the set of second split data 7411 to 7416 are convolved for each input channel to obtain a set of input channel convolution data 7511 to 7516.
  • Input channel convolution data generated from input activation belonging to the xth group Gx is also considered to belong to the xth group Gx.
  • the input channel convolution data 7511 generated from the input activation 711 belonging to the first group G1 is also regarded as belonging to the first group G1.
  • In step S115, convolution data of a specific group is generated by performing an element-by-element addition operation on the plurality of input channel convolution data belonging to that group.
  • an element-by-element addition operation is performed on the plurality of input channel convolution data 7511 and 7512 belonging to the first group G1 to generate the first group convolution data 7112.
  • the second group of convolutional data 7134 and the third group of convolutional data 7156 are generated for the second group G2 and the third group G3, respectively.
  • the convolution data 7112 of the first group, the convolution data 7134 of the second group, and the convolution data 7156 of the third group may be respectively referred to as group-specific convolution data.
  • In step S120, a scale for expressing each of the group-specific convolution data 7112, 7134, and 7156 may be determined based on statistical values of the values constituting that convolution data.
  • For example, the scale sc_co1_ci1,2 to be applied for the expression of the convolution data 7112 can be determined based on the distribution of values of the six elements constituting the first group of convolution data 7112.
  • The scales applied to the first group of convolution data 7112, the second group of convolution data 7134, and the third group of convolution data 7156 can be determined as sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6, respectively.
  • sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6 are values that can be independently determined. Accordingly, sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6 may be the same or different.
  • In step S130, specific expression values of the set of group-specific convolution data 7112, 7134, and 7156 may be determined according to the determined scales sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6.
  • In step S140, intermediate data 751p may be generated by performing an addition operation on the first group of convolution data 7112 expressed in the 'first scale (sc_co1_ci1,2)' and the second group of convolution data 7134 expressed in the 'second scale (sc_co1_ci3,4)'.
  • FIG. 9A shows an example in which the third scale (sc_co1_ci5,6) is not smaller than the first scale (sc_co1_ci1,2) and is not smaller than the second scale (sc_co1_ci3,4).
  • FIG. 9A shows a modification of the method described in FIG. 7A to which the concept of grouping input channels is applied.
  • FIG. 9B illustrates a convolution operation process between the input activation 710 and the second output channel.
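  • The grouping variant of FIG. 9A can be sketched as below. The function names are hypothetical, the scale statistic is again assumed to be the largest absolute value, and the sample numbers are made up for illustration. Input channel convolution data are first summed element by element within each group (step S115), and the resulting group-specific data are then added in ascending order of scale.

```python
def add_elementwise(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def group_convolution_data(channel_conv, groups):
    """S115: element-wise sum of the input channel convolution data per group."""
    grouped = []
    for members in groups:
        acc = channel_conv[members[0]]
        for idx in members[1:]:
            acc = add_elementwise(acc, channel_conv[idx])
        grouped.append(acc)
    return grouped


def scale_of(data):
    """Assumed statistic for step S120: largest absolute value."""
    return max(abs(v) for row in data for v in row)


def accumulate_by_scale(grouped):
    """S120 to S150: one scale per group, then add the largest scale last."""
    scales = [scale_of(g) for g in grouped]
    order = sorted(range(len(grouped)), key=lambda i: scales[i])
    acc = grouped[order[0]]
    for i in order[1:]:
        acc = add_elementwise(acc, grouped[i])
    return acc, order


# Six input channel convolution data (1x2 each), grouped as G1, G2, G3.
channel_conv = [[[1, 2]], [[3, 4]], [[10, 20]], [[30, 40]], [[500, 600]], [[700, 800]]]
groups = [[0, 1], [2, 3], [4, 5]]
grouped = group_convolution_data(channel_conv, groups)
print(grouped)
print(accumulate_by_scale(grouped))
```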
  • Steps indicated by reference numerals S210, S215, S220, S230, S240, and S250 in FIG. 9B correspond to steps indicated by reference numerals S110, S115, S120, S130, S140, and S150 in FIG. 9A, respectively.
  • Components indicated by reference numerals 7421 to 7426, 7521 to 7526, 7212, 7234, 7256, 752p, and 752 in FIG. 9B correspond to components indicated by reference numerals 7411 to 7416, 7511 to 7516, 7112, 7134, 7156, 751p, and 751 in FIG. 9A, respectively.
  • Components indicated by reference numerals sc_co2_ci1,2, sc_co2_ci3,4, and sc_co2_ci5,6 in FIG. 9B correspond to components indicated by reference numerals sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6 in FIG. 9A, respectively.
  • FIG. 10 illustrates an embodiment of a specific method for determining input channels to belong to a specific group shown in FIG. 9A.
  • the computing device may calculate statistical values of elements constituting the corresponding input channel convolution data.
  • The first range (rg_co1_ci1) of the first input channel convolution data 7511 may be determined based on the minimum and maximum values of the six elements constituting the first input channel convolution data 7511. For example, if the minimum value of the first input channel convolution data 7511 is 1 and the maximum value is 5, the first range rg_co1_ci1 may be 4, which is the difference between the maximum and minimum values, or 1, which is the minimum value, or 5, which is the maximum value.
  • Similarly, a second range (rg_co1_ci2), a third range (rg_co1_ci3), a fourth range (rg_co1_ci4), a fifth range (rg_co1_ci5), and a sixth range (rg_co1_ci6) may be determined.
  • The computing device may group the set of input channel convolution data 7511 to 7516, or the input channels 711 to 716 constituting the input activation 710, based on the values of the ranges rg_co1_ci1 to rg_co1_ci6.
  • For example, if the first range (rg_co1_ci1), the second range (rg_co1_ci2), the third range (rg_co1_ci3), and the fourth range (rg_co1_ci4) are 4, 5, 400, and 500, respectively, the first range (rg_co1_ci1) and the second range (rg_co1_ci2) may be grouped into the first group, and the third range (rg_co1_ci3) and the fourth range (rg_co1_ci4) may be grouped into the second group.
  • the first group G1, the second group G2, and the third group G3 shown in FIG. 9 may be determined through this process.
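  • The range-based grouping of FIG. 10 might be sketched as follows. The text does not fix a concrete grouping rule, so the rule used here (sort the channels by range and cut the sorted list into chunks of a fixed size) is purely an assumption, as are the names `value_range` and `group_by_range`. The three `mode` options mirror the three alternatives given for the first range (difference, minimum, or maximum).

```python
def value_range(data, mode="span"):
    """Range of a 2-D array: max - min ("span"), the minimum, or the maximum."""
    values = [v for row in data for v in row]
    lo, hi = min(values), max(values)
    if mode == "span":
        return hi - lo
    if mode == "min":
        return lo
    if mode == "max":
        return hi
    raise ValueError(f"unknown mode: {mode}")


def group_by_range(ranges, group_size):
    """Assumed rule: channels with neighbouring ranges share a group."""
    order = sorted(range(len(ranges)), key=lambda i: ranges[i])
    return [sorted(order[i:i + group_size])
            for i in range(0, len(order), group_size)]


# Six elements with minimum 1 and maximum 5, as in the example for rg_co1_ci1.
sample = [[1, 3, 5], [2, 4, 5]]
print(value_range(sample), value_range(sample, "min"), value_range(sample, "max"))

# Ranges 4, 5, 400, 500 for input channels 0 to 3 (zero-based indices).
print(group_by_range([4, 5, 400, 500], group_size=2))
```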
  • FIG. 11 shows another embodiment of a specific method of determining input channels belonging to a specific group shown in FIG. 9A.
  • FIG. 11 is a modification of the embodiment of FIG. 10: the statistical value is calculated from the set of split data 7411 to 7416 presented in FIG. 9A instead of from the set of input channel convolution data 7511 to 7516.
  • A first range (rg_w_co1_ci1), a second range (rg_w_co1_ci2), a third range (rg_w_co1_ci3), a fourth range (rg_w_co1_ci4), a fifth range (rg_w_co1_ci5), and a sixth range (rg_w_co1_ci6) may be determined.
  • The computing device may group the split data 7411 to 7416, or the input channels 711 to 716 constituting the input activation 710, based on the values of the ranges rg_w_co1_ci1 to rg_w_co1_ci6.
  • the first group G1, the second group G2, and the third group G3 shown in FIG. 9A may be determined through this process.
  • The methods of FIGS. 10 and 11 can also be applied to FIG. 9B.
  • FIG. 12 shows a method that integrates and performs the methods presented in FIGS. 9A and 9B.
  • FIG. 12 illustrates a convolution operation process between an input activation 710 composed of six input channels 711 to 716 and a weight composed of two output channels.
  • The first output channel includes six input channels 7411 to 7416, and the second output channel includes six input channels 7421 to 7426.
  • the input channels constituting the input activation 710 are grouped.
  • a detailed method of determining input channels belonging to a specific group is as described above.
  • Steps S310, S315, S320, S330, S340, and S350 shown in FIG. 12 correspond to steps S110, S115, S120, S130, S140, and S150 shown in FIG. 9A, respectively.
  • In step S310, a set of first split data 711 to 716 obtained by splitting the input activation 710 for each input channel may be obtained.
  • A set of second split data 7411 to 7416 and 7421 to 7426 may be obtained by splitting the output channels 741 and 742 of the weight 740 for each input channel.
  • The set of second split data 7411 to 7416 and 7421 to 7426 is composed of split data 7411 to 7416 corresponding to the first output channel of the weight and split data 7421 to 7426 corresponding to the second output channel of the weight.
  • The set of first split data 711 to 716 and the set of second split data 7411 to 7416 and 7421 to 7426 are convolved for each input channel to generate a set of input channel convolution data 7501 to 7506.
  • the input channel convolution data corresponding to each input channel is composed of convolution data corresponding to a first output channel of weights and convolution data corresponding to a second output channel of weights.
  • the input channel convolution data 7501 corresponding to the first input channel is composed of convolution data 7511 corresponding to the first output channel of weights and convolution data 7521 corresponding to the second output channel of weights.
  • Here, convolution data 7511 is calculated by a convolution operation between split data 711 and split data 7411, and convolution data 7521 is calculated by a convolution operation between split data 711 and split data 7421.
  • Input channel convolution data generated from input activation belonging to the x-th group (Gx) is also considered to belong to the x-th group (Gx).
  • In step S315, convolution data of a specific group is generated by performing an element-by-element addition operation on the plurality of input channel convolution data belonging to that group.
  • the first group of convolutional data 7012 may be generated by performing an element-by-element addition operation on the plurality of input channel convolutional data 7501 and 7502 belonging to the first group G1.
  • the first output channel data 7112 is calculated by performing an element-by-element addition operation between the input channel convolutional data 7511 and the input channel convolutional data 7512.
  • the second output channel data 7212 is calculated by performing an element-by-element addition operation between the convolutional data 7521 and the convolutional data 7522.
  • In step S320, a scale representing the convolution data for each group may be determined based on statistical values of the values constituting the convolution data 7012, 7034, and 7056 for each group.
  • For example, the scale sc_co_ci1,2 to be applied for the expression of the first group of convolution data 7012 can be determined based on the distribution of values of the 12 elements constituting the first group of convolution data 7012.
  • The scales applied to the first group of convolution data 7012, the second group of convolution data 7034, and the third group of convolution data 7056 can be determined as sc_co_ci1,2, sc_co_ci3,4, and sc_co_ci5,6, respectively.
  • sc_co_ci1,2, sc_co_ci3,4, and sc_co_ci5,6 are values that can be independently determined.
  • In step S340, intermediate data 750p may be generated by performing an addition operation on the first convolution data 7012 expressed in the 'first scale (sc_co_ci1,2)' and the second convolution data 7034 expressed in the 'second scale (sc_co_ci3,4)' among the set of convolution data 7012, 7034, and 7056.
  • The first output channel portion 751p of the intermediate data 750p is calculated by adding the first output channel portion of the first convolution data 7012 and the first output channel portion of the second convolution data 7034 for each element.
  • The second output channel portion 752p of the intermediate data 750p can be derived by adding the second output channel portion of the first convolution data 7012 and the second output channel portion of the second convolution data 7034 for each element.
  • In step S350, the output activation 750 may be calculated by performing an addition operation on the third convolution data 7056 expressed in the 'third scale (sc_co_ci5,6)' among the set of convolution data 7012, 7034, and 7056 and the intermediate data 750p.
  • The first output channel portion 751 of the output activation 750 may be calculated by adding the first output channel portion 751p of the intermediate data 750p and the first output channel portion of the third convolution data 7056 for each element.
  • The second output channel portion 752 of the output activation 750 may be calculated by adding the second output channel portion 752p of the intermediate data 750p and the second output channel portion of the third convolution data 7056 for each element.
  • FIG. 12 shows an example in which the third scale (sc_co_ci5,6) is not smaller than the first scale (sc_co_ci1,2) and is not smaller than the second scale (sc_co_ci3,4).
  • step S340 may be performed before step S350.
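  • The integrated method of FIG. 12 differs from FIGS. 9A and 9B in that a single scale is determined per group across all output channels (12 elements in the example). A rough sketch under assumptions follows: the names are hypothetical, each group's data is modelled as a list of per-output-channel 2-D arrays, the scale statistic is assumed to be the largest absolute value, and the sample values are invented.

```python
def shared_scale(group):
    """One scale per group, taken over every output channel portion."""
    return max(abs(v) for portion in group for row in portion for v in row)


def add_group(a, b):
    """Element-wise addition, output channel portion by portion."""
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(pa, pb)]
            for pa, pb in zip(a, b)]


def accumulate_groups(groups):
    """S340/S350: add groups in ascending order of their shared scale."""
    ordered = sorted(groups, key=shared_scale)
    acc = ordered[0]
    for g in ordered[1:]:
        acc = add_group(acc, g)
    return acc


# Each group holds two output channel portions (1x2 each).
g_a = [[[1, 2]], [[3, 4]]]
g_b = [[[10, 20]], [[30, 40]]]
g_c = [[[100, 200]], [[300, 400]]]
print(shared_scale(g_a), shared_scale(g_b), shared_scale(g_c))
print(accumulate_groups([g_c, g_a, g_b]))
```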
  • FIG. 13 illustrates an embodiment of a specific method for determining input channels to belong to a specific group shown in FIG. 12 .
  • the computing device may calculate statistical values of elements constituting the corresponding input channel convolution data. Based on the statistical value, ranges rg_co_ci1 to rg_co_ci6 may be determined for each of the input channel convolution data 7501 to 7506 .
  • The computing device may group the set of input channel convolution data 7501 to 7506, or the input channels 711 to 716 constituting the input activation 710, based on the values of the ranges rg_co_ci1 to rg_co_ci6.
  • the first group G1, the second group G2, and the third group G3 shown in FIG. 12 may be determined through this process.
  • FIG. 14 shows another embodiment of a specific method for determining input channels belonging to a specific group shown in FIG. 12 .
  • The computing device may group the set of split data 7411 to 7416 and 7421 to 7426, or the input channels 711 to 716 constituting the input activation 710, based on the values of the ranges rg_w_co_ci1 to rg_w_co_ci6.
  • the first group G1, the second group G2, and the third group G3 shown in FIG. 12 may be determined through this process.
  • FIG. 15 is a flowchart illustrating a calculation method provided according to an embodiment of the present invention.
  • In step S410, the computing device may generate a set of convolution data by convolving the first input data and the second input data 740 for each input channel.
  • In step S420, the computing device may determine a scale representing each convolution data based on a statistical value of the values constituting each convolution data of the set of convolution data.
  • In step S430, the computing device may generate intermediate data by performing an addition operation on first convolution data represented by a first scale and second convolution data represented by a second scale among the set of convolution data.
  • In step S440, after the step of generating the intermediate data, the computing device may calculate output data by performing an addition operation on the intermediate data and third convolution data expressed in a third scale among the set of convolution data.
  • the third scale is not smaller than the first scale, and the third scale is not smaller than the second scale.
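  • This passage does not state why the largest-scale data is added last. One plausible reading, offered here only as an assumption, is that data expressed at a coarse scale cannot resolve small addends, so small-scale data are better summed among themselves first. The sketch below models a scale as a power-of-two exponent (a value is stored as an integer `q` meaning `q * 2**exp`) and aligns operands to the coarser scale by a right shift. This representation and the names `align` and `add_scaled` are hypothetical and are not taken from the source.

```python
def align(values, exp, target_exp):
    """Re-express integers stored at scale 2**exp at the coarser 2**target_exp."""
    shift = target_exp - exp  # non-negative because target_exp is the maximum
    return [v >> shift for v in values]


def add_scaled(a, b):
    """Add two (values, exp) operands, expressing the sum at the larger scale."""
    (va, ea), (vb, eb) = a, b
    e = max(ea, eb)
    return [x + y for x, y in zip(align(va, ea, e), align(vb, eb, e))], e


first = ([5], 0)    # real value 5, fine scale
second = ([5], 0)   # real value 5, fine scale
third = ([5], 3)    # real value 40, coarse scale

# Small scales first (the order of steps S430 and S440): 5 + 5, then + 40.
small_first = add_scaled(add_scaled(first, second), third)
# Largest scale first: each small addend is shifted out before it can accumulate.
large_first = add_scaled(add_scaled(first, third), second)

print(small_first, small_first[0][0] * 2 ** small_first[1])
print(large_first, large_first[0][0] * 2 ** large_first[1])
```

  • In this toy case the exact sum is 50; adding the small-scale operands first yields 48, while adding the coarse operand first yields 40.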
  • The first input data may be the input data 711 to 716 shown in FIG. 12, and the second input data may be the data 7411 to 7416 and 7421 to 7426 shown in FIG. 12.
  • a set of convolutional data may be data 7012, 7034, and 7056 shown in FIG. 12 .
  • the scale representing each convolution data may be the scales (sc_co_ci1,2, sc_co_ci3,4, and sc_co_ci5,6) shown in FIG. 12 .
  • the first convolution data, the second convolution data, and the intermediate data may be data 7012 , 7034 , and 750p shown in FIG. 12 , respectively.
  • the third convolution data and the output data may be data 7056 and 750 shown in FIG. 12 , respectively.
  • the first input data is the input data 711 to 716 shown in FIG. 9A
  • the second input data is the data 7411 to 7416 shown in FIG. 9A
  • The set of convolution data may be data 7112, 7134, and 7156 shown in FIG. 9A.
  • The scale representing each of the convolution data may be the scales (sc_co1_ci1,2, sc_co1_ci3,4, and sc_co1_ci5,6) presented in FIG. 9A.
  • The first convolution data, the second convolution data, and the intermediate data may be data 7112, 7134, and 751p shown in FIG. 9A, respectively.
  • the third convolution data and the output data may be data 7156 and 751 presented in FIG. 9A, respectively.
  • the first input data is the input data 711 to 713 shown in FIG. 7A
  • the second input data is the data 7411 to 7413 shown in FIG. 7A
  • The set of convolution data may be data 7511 to 7513 presented in FIG. 7A.
  • the scale representing each convolution data may be the scale (sc_co1_ci1 to sc_co1_ci3) presented in FIG. 7A.
  • the first convolution data, the second convolution data, and the intermediate data may be data 7511, 7512, and 751p presented in FIG. 7A, respectively.
  • the third convolution data and the output data may be data 7513 and 751 presented in FIG. 7A, respectively.
  • The calculating of the output data may include a step (S412) of calculating output data corresponding to the one set of output channels of the second input data by performing an addition operation on the third convolution data and the intermediate data.
  • The first split data set, the second split data set, and the output data corresponding to the output channel set of the second input data may be data 711 to 716, data 7411 to 7416 and 7421 to 7426, and data 750 shown in FIG. 12, respectively.
  • The first split data set, the second split data set, and the output data corresponding to the output channel set of the second input data may be data 711 to 716, data 7411 to 7416, and data 751 shown in FIG. 9A, respectively.
  • The first split data set, the second split data set, and the output data corresponding to the output channel set of the second input data may be data 711 to 713, data 7411 to 7413, and data 751 shown in FIG. 7A, respectively.
  • The set of output channels may be any one specific output channel among a plurality of output channels constituting the second input data, and the output data corresponding to the set of output channels of the second input data may be output data corresponding to that specific output channel.
  • the first split data of the one set and the second split data of the one set are convolved for each input channel to obtain a set of input channel convolution data corresponding to each input channel.
  • the set of input channel convolution data may be data 7501 to 7506 shown in FIG. 12 .
  • the set of input channel convolution data may be data 7511 to 7516 shown in FIG. 9A.
  • each of the convolution data is the same as one input channel convolution data among the set of input channel convolution data, or two or more input channels among the set of input channel convolution data. It may be calculated by performing an element-by-element addition operation on the convolution data.
  • Determining the group may include: calculating, by the computing device, a range of values of elements constituting each of the second split data to determine a set of ranges (rg_w_co_ci1 to rg_w_co_ci6); and grouping, by the computing device, the one set of input channel convolution data based on the one set of ranges.
  • the set of ranges may be the ranges (rg_w_co1_ci1 to rg_w_co1_ci6) shown in FIG. 11 or the ranges (rg_w_co_ci1 to rg_w_co_ci6) shown in FIG. 14 .
  • Alternatively, determining the group may include: calculating, by the computing device, a range of values of elements constituting each of the input channel convolution data to determine a set of ranges (rg_co_ci1 to rg_co_ci6); and grouping, by the computing device, the one set of input channel convolution data based on the one set of ranges.
  • the set of ranges may be the ranges (rg_co1_ci1 to rg_co1_ci6) shown in FIG. 10 or the ranges (rg_co_ci1 to rg_co_ci6) shown in FIG. 13 .
  • the step of generating the one set of convolution data may include convolution of the first split data of the one set and the second split data of the one set for each input channel to obtain an input channel convolution corresponding to each input channel. It may include generating a set of input channel convolution data consisting of data. Further, each of the convolution data may be the same as one input channel convolution data among the set of input channel convolution data.
  • The computing device may perform the step of generating the set of convolution data, the step of determining, the step of generating the intermediate data, and the step of calculating for all output channels included in the second input data.
  • the computing device may be configured to combine output data for each channel generated for each output channel included in the second input data to generate output data including all the output channels.
  • The first input data may be an input activation, the second input data may be a weight, and the output data may be an output activation, and a dimension of the weight may be greater than a dimension of the input activation.
  • the input activation includes a plurality of first input channel data, each of the first input channel data is a two-dimensional array, the weight includes a plurality of output channel data, and each of the output channel data is It includes a plurality of second input channel data, and each of the second input channel data may be a two-dimensional array.
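  • The data layout described above can be pictured with nested lists. In this small sketch the helper names `zeros` and `ndim` and the 4x4 and 3x3 array sizes are hypothetical; the six input channels and two output channels follow the example of FIG. 12. The input activation is a list of two-dimensional arrays (three dimensions in total), while the weight is a list of output channels, each holding one two-dimensional array per input channel (four dimensions in total).

```python
def zeros(*shape):
    """Nested list of zeros with the given shape."""
    if len(shape) == 1:
        return [0] * shape[0]
    return [zeros(*shape[1:]) for _ in range(shape[0])]


def ndim(obj):
    """Nesting depth of a rectangular list-of-lists structure."""
    depth = 0
    while isinstance(obj, list):
        depth += 1
        obj = obj[0]
    return depth


input_activation = zeros(6, 4, 4)   # 6 input channels, each a 4x4 array
weight = zeros(2, 6, 3, 3)          # 2 output channels x 6 input channels x 3x3

print(ndim(input_activation), ndim(weight))
# Every output channel of the weight has one 2-D array per input channel.
print(all(len(out_ch) == len(input_activation) for out_ch in weight))
```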
  • The present invention was developed by Open Edge Technology Co., Ltd. (project performing organization) in the course of carrying out the research project "Development of a sensory-based context-predictive mobile artificial intelligence processor" (task number 2020001310, task number 2020-0-01310, research period 2020.04.01 to 2024.12.31), which is part of the next-generation intelligent semiconductor technology development (design)-artificial intelligence processor business, a research program supported by the Ministry of Science and ICT and the Information and Communication Planning and Evaluation Institute of the National Research Foundation of Korea.

Abstract

Disclosed is an operation method comprising the steps of: generating a set of convolution data by convolving, for each input channel, a set of first split data obtained by splitting first input data for each input channel and a set of second split data obtained by splitting a particular output channel of second input data for each input channel; determining a scale representing each convolution data based on a statistical value of the values constituting each convolution data of the set of convolution data; generating intermediate data by performing an addition operation on first convolution data represented in a first scale and second convolution data represented in a second scale among the set of convolution data; and, after the step of generating the intermediate data, performing, by a computing device, an addition operation on the intermediate data and third convolution data represented in a third scale among the set of convolution data to calculate channel-specific output data for a particular output channel corresponding to the particular output channel of the second input data. The third scale is not smaller than the first scale, and the third scale is not smaller than the second scale.
PCT/KR2022/006216 2021-09-16 2022-04-29 Addition operation method taking data scale into account, associated hardware accelerator, and computing device using same WO2023042989A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2021-0124097 2021-09-16
KR1020210124097A KR102395744B1 (ko) 2021-09-16 2021-09-16 데이터 스케일을 고려한 덧셈 연산 방법 및 이를 위한 하드웨어 가속기, 이를 이용한 컴퓨팅 장치

Publications (1)

Publication Number Publication Date
WO2023042989A1 true WO2023042989A1 (fr) 2023-03-23

Family

ID=81582619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/006216 WO2023042989A1 (fr) 2021-09-16 2022-04-29 Addition operation method taking data scale into account, associated hardware accelerator, and computing device using same

Country Status (2)

Country Link
KR (1) KR102395744B1 (fr)
WO (1) WO2023042989A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240008747A (ko) * 2022-07-12 2024-01-19 오픈엣지테크놀로지 주식회사 데이터 스케일을 고려한 콘볼루션 데이터의 양자화 방법, 이를 위한 하드웨어 가속기, 및 이를 이용한 컴퓨팅 장치

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010052899A (ko) * 1998-06-15 2001-06-25 테크니셰 유니베르시테트 드레스덴 데이터 연산 처리 장치
KR20190051697A (ko) * 2017-11-07 2019-05-15 삼성전자주식회사 뉴럴 네트워크의 디컨벌루션 연산을 수행하는 장치 및 방법
KR20190118365A (ko) * 2018-04-10 2019-10-18 한국항공대학교산학협력단 컨벌루션 신경망의 첫번째 레이어의 개선된 이진화 장치 및 방법
KR20200000480A (ko) * 2017-04-19 2020-01-02 상하이 캠브리콘 인포메이션 테크놀로지 컴퍼니 리미티드 처리 장치 및 처리 방법
KR20210099991A (ko) * 2020-02-05 2021-08-13 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. 딥 러닝 처리 장치, 방법, 기기 및 저장 매체

Also Published As

Publication number Publication date
KR102395744B1 (ko) 2022-05-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22870079

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE