CN115310035A

CN115310035A - Data processing method, data processing device, electronic equipment, medium and chip

Info

Publication number: CN115310035A
Application number: CN202210945376.4A
Authority: CN
Inventors: 陈庆澍; 王勇; 欧阳剑; 邰秀瑢; 王京
Original assignee: Kunlun Core Beijing Technology Co ltd
Current assignee: Kunlun Core Beijing Technology Co ltd
Priority date: 2022-08-08
Filing date: 2022-08-08
Publication date: 2022-11-08

Abstract

The disclosure provides a data processing method, a data processing device, an electronic device, a medium and a chip, and relates to the technical field of artificial intelligence, in particular to the technical field of artificial intelligence chips. The implementation scheme is as follows: acquiring a first matrix and a second matrix; determining a first element with the largest absolute value in the first matrix, and recording the absolute value of the first element as a first maximum value Max1; mapping each element in the first matrix to the interval [0, 2ⁿ] to convert the element into a corresponding conversion element; determining a second element with the largest absolute value in the second matrix, and recording the absolute value of the second element as a second maximum value Max2; mapping each element in the second matrix to the interval [0, 2ⁿ] to convert the element into a corresponding conversion element; and calculating the product of the first matrix and the second matrix based on the conversion element corresponding to each element in the first matrix and the second matrix.

Description

Data processing method, data processing device, electronic equipment, medium and chip

Technical Field

The present disclosure relates to the field of artificial intelligence technology, in particular to the field of artificial intelligence chip technology, and more specifically to a data processing method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product.

Background

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves technologies at both the hardware level and the software level. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.

Artificial intelligence models contain a large number of computationally intensive operators, mainly including matrix multiplication, convolution, pooling, activation, and so on. These computations are very time-consuming, and the computing power of a conventional CPU can hardly meet the performance requirements, so heterogeneous computing has become mainstream, and various artificial intelligence processors, including GPUs, FPGAs, and ASICs, are widely applied to artificial intelligence model computation. Meanwhile, the choice of data type plays an important role in the precision, performance, and other aspects of artificial intelligence computation.

The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.

Disclosure of Invention

The present disclosure provides a method, an apparatus, an electronic device, a computer-readable storage medium, a computer program product, and a chip.

According to an aspect of the present disclosure, there is provided a data processing method including: obtaining a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; determining a first element with the largest absolute value in the first matrix, and recording the absolute value of the first element as a first maximum value Max1; mapping each element in the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1 to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits; determining a second element with the largest absolute value in the second matrix, and recording the absolute value of the second element as a second maximum value Max2; mapping each element in the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2 to convert the element into a corresponding conversion element; and calculating a product of the first matrix and the second matrix as a third matrix based on the conversion element corresponding to each element in the first matrix and the second matrix.

According to another aspect of the present disclosure, there is provided a data processing apparatus including: an obtaining module configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; a first determining module configured to determine a first element with the largest absolute value in the first matrix, and record the absolute value of the first element as a first maximum value Max1; a first mapping module configured to map each element of the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1 to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits; a second determining module configured to determine a second element with the largest absolute value in the second matrix, and record the absolute value of the second element as a second maximum value Max2; a second mapping module configured to map each element of the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2 to convert the element into a corresponding conversion element; and a calculation module configured to calculate a product of the first matrix and the second matrix as a third matrix based on the conversion element corresponding to each element of the first matrix and the second matrix.

According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above method.

According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the above method.

According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program implements the above method when executed by a processor.

According to another aspect of the present disclosure, there is provided an electronic circuit comprising: circuitry configured to perform the above-described method.

According to one or more embodiments of the present disclosure, a data processing method is provided that uses a data format having more mantissa bits than conventional floating-point data, so as to improve the precision of computation and, in turn, the accuracy of artificial intelligence model training and inference. Meanwhile, the original floating point data is mapped onto all of the mantissa bits, so that floating point calculation is converted into fixed point calculation without an exponent; this reduces calculation difficulty, improves calculation efficiency, and saves the hardware resources used for calculation.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the embodiments and, together with the description, serve to explain the exemplary implementations of the embodiments. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.

FIG. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure;

FIG. 2 shows a flow diagram of a method of converting elements in a matrix into conversion elements according to an embodiment of the disclosure;

FIG. 3 shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure; and

FIG. 4 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the present disclosure, unless otherwise specified, the use of the terms "first", "second", and the like to describe various elements is not intended to limit the positional relationship, the temporal relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.

The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.

In the related art, the standard IEEE float type is mainly used in the calculation process of artificial intelligence models. With the continuous development of the technology, some new calculation types, such as the half-precision types bfloat16, fp16, and int16, have emerged to replace the standard float. fp16 and bfloat16 include sign bits, exponent bits, and mantissa bits; because they have exponent bits, the range of values they can represent is large, but the large number of exponent bits also makes the calculation complicated, so their calculation efficiency is lower than that of int16. int16 is a fixed-point type whose calculation efficiency is higher than that of fp16/bfloat16, but because it has no exponent bits, the range of values it can represent is smaller than that of fp16/bfloat16, and training fails to converge for many artificial intelligence models, so its range of use is relatively limited.

In order to solve the above problems, the present disclosure provides a data processing method that uses a data format having more mantissa bits than conventional floating point data, so as to improve the precision of calculation and, in turn, the accuracy of training and inference of the artificial intelligence model. Meanwhile, the original floating point data is mapped onto all of the mantissa bits, so that floating point calculation is converted into fixed point calculation without an exponent; this reduces calculation difficulty, improves calculation efficiency, and saves the hardware resources used for calculation.

Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

Fig. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the data processing method 100 includes: step S101, obtaining a first matrix and a second matrix, wherein each element in the first matrix and the second matrix is stored in a first storage unit in a first data format, the first data format is a floating point type and has 1 sign bit and m mantissa bits, and m is an integer greater than 1; step S102, determining a first element with the largest absolute value in the first matrix, and recording the absolute value of the first element as a first maximum value Max1; step S103, based on the first maximum value Max1, mapping each element in the first matrix to the interval [0, 2ⁿ] to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits.

Thus, through step S103, each element of the first matrix stored in the first data format is converted into a conversion element stored in the second data format, so that each element in the first matrix is represented with more mantissa bits.

Step S104, determining a second element with the largest absolute value in the second matrix, and recording the absolute value of the second element as a second maximum value Max2; step S105, based on the second maximum value Max2, mapping each element in the second matrix to the interval [0, 2ⁿ] to convert the element into a corresponding conversion element.

Thus, through step S105, each element of the second matrix stored in the first data format is converted into a conversion element stored in the second data format, so that each element in the second matrix is represented with more mantissa bits.

Step S106, calculating a product of the first matrix and the second matrix to be a third matrix based on the conversion element corresponding to each element in the first matrix and the second matrix.

Determining the maximum absolute value in a matrix establishes the data distribution of that matrix, so that the value of each element can be mapped onto the mantissa bits of the corresponding conversion element. The floating point matrix calculation is thereby converted into a fixed point matrix operation, which effectively reduces the difficulty of the matrix calculation and saves the hardware resources used for calculation. Converting the floating point data in the first data format into the second data format with more mantissa bits also improves the calculation precision, and with it the training and inference accuracy of an artificial intelligence model that performs matrix operations using this method. Meanwhile, because the original floating point data is mapped onto all mantissa bits of the conversion elements stored in the second data format, the floating point matrix calculation becomes a fixed point calculation without an exponent.
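The flow of steps S101 to S106 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the disclosed hardware implementation: it uses a single interval [0, Max] per matrix (no subinterval split), rounds to the nearest integer, and the helper names `quantize` and `fixed_point_matmul` are hypothetical.

```python
def quantize(matrix, n):
    """Map |x| in [0, max_abs] to an integer in [0, 2**n]; signs are kept separately."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = (1 << n) / max_abs if max_abs else 0.0
    mags = [[round(abs(x) * scale) for x in row] for row in matrix]
    signs = [[1 if x < 0 else 0 for x in row] for row in matrix]
    return mags, signs, max_abs


def fixed_point_matmul(a, b, n=14):
    """Multiply a (rows x inner) by b (inner x cols) using integer products only."""
    qa, sa, max1 = quantize(a, n)
    qb, sb, max2 = quantize(b, n)
    rows, inner, cols = len(a), len(b), len(b[0])
    # A single floating point rescale per output element restores the original units.
    rescale = (max1 / (1 << n)) * (max2 / (1 << n))
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0  # integer accumulator: no exponent handling in the inner loop
            for k in range(inner):
                term = qa[r][k] * qb[k][c]
                # The sign of each term is the XOR of the two operand signs.
                acc += -term if sa[r][k] ^ sb[k][c] else term
            out_row.append(acc * rescale)
        out.append(out_row)
    return out
```

With n = 14, the result for small well-scaled matrices should agree with an ordinary floating point product to within the quantization step of the inputs.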

According to some embodiments, the first data format and the second data format each have 16 bits. Therefore, by providing a second data format that has 16 bits in total, 14 of which are mantissa bits, multiplication of floating point data in a conventional first data format such as fp16 or bfloat16 is converted into fixed point calculation on 14 mantissa bits. The hardware implementation is thus simpler and more efficient, with efficiency close to that of the integer type int16 with the same number of bits; compared with calculation types such as fp16 and bfloat16, the use of hardware resources can be reduced and the peak performance of the artificial intelligence chip can be improved.

For convenience of description, the following description will be given taking an example in which the second data format contains 16-bit data and the mantissa bit number n is 14. However, it is understood that the present disclosure is not limited to 16-bit data conversion, and may also be used for conversion of 32-bit single-precision floating point numbers or 64-bit double-precision floating point numbers to convert floating point operations including exponent operations into fixed point operations not including exponents, thereby improving operation efficiency and saving hardware computing resources.
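To make the 16-bit example concrete, the sketch below packs the three fields (1 sign bit, 1 exponent bit, 14 mantissa bits) into one 16-bit word. The bit order (sign in the top bit, then the exponent bit, then the mantissa) and the saturation of the value 2¹⁴ to 2¹⁴ − 1 are assumptions made for illustration; the source does not specify either.

```python
MANT_BITS = 14
EXP_SHIFT = MANT_BITS          # assumed position of the single exponent bit
SIGN_SHIFT = MANT_BITS + 1     # assumed position of the sign bit
MANT_MASK = (1 << MANT_BITS) - 1


def pack(sign, exp_bit, mantissa):
    """Pack sign, exponent bit, and mantissa into one 16-bit word."""
    # 2**14 itself does not fit in 14 bits, so this sketch saturates at 2**14 - 1.
    mantissa = min(mantissa, MANT_MASK)
    return (sign << SIGN_SHIFT) | (exp_bit << EXP_SHIFT) | mantissa


def unpack(word):
    """Return (sign, exp_bit, mantissa) from a 16-bit word."""
    return (word >> SIGN_SHIFT) & 1, (word >> EXP_SHIFT) & 1, word & MANT_MASK
```

For example, `unpack(pack(1, 1, 12345))` returns the original triple, and every packed word fits in 16 bits.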

Fig. 2 shows a flow diagram of a method of converting elements in a matrix into conversion elements according to an embodiment of the disclosure. As shown in fig. 2, step S103 includes: step S201, determining a first division point based on the first maximum value Max1, so as to divide the interval [0, Max1] into two subintervals; and step S202, for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to that element.

It will be appreciated that the distribution range of the elements in the first matrix, i.e. [0, Max1], may be determined by determining the maximum of the absolute values of the elements in the matrix. The distribution range is further segmented by determining the division point, so that the mapping calculation is carried out separately according to the subinterval into which each element falls; the distribution range of the data is thus divided more finely, and higher calculation accuracy is obtained.
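Steps S201 and S202 can be sketched as below. The division point is passed in as a hypothetical parameter `split`. The choice of exponent bit 0 for the lower subinterval and 1 for the upper one, the inclusive boundary, and the two linear scalings are assumptions for illustration rather than the disclosure's exact formulas.

```python
def encode(x, max_abs, split, n=14):
    """Return (sign, exponent_bit, mantissa) for x, assuming 0 < split <= max_abs."""
    sign = 1 if x < 0 else 0
    a = abs(x)
    if a <= split:
        # Lower subinterval [0, split]: all 2**n steps cover only this range,
        # so small values are resolved with a finer step of split / 2**n.
        return sign, 0, round(a / split * (1 << n))
    # Upper subinterval (split, max_abs]: steps cover the full range [0, max_abs],
    # giving a step of max_abs / 2**n.
    return sign, 1, round(a / max_abs * (1 << n))
```

Under these assumptions, a small value such as 0.5 with `max_abs = 8.0` and `split = 1.0` is encoded with exponent bit 0 and a mantissa of 8192, whereas without the split it would be encoded as 1024 and so resolved eight times more coarsely.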

According to some embodiments, step S201 comprises: determining the first division point, based on the first maximum value Max1, as [formula missing], so as to divide the interval [0, Max1] into two subintervals, [formula missing] and [formula missing].

Taking n = 14 as an example, when the second data format is 16-bit data having 14 mantissa bits, each element in the first matrix is mapped to [0, 2¹⁴] so that the absolute value of each element is represented by 14 mantissa bits. The distribution interval [0, Max1] of the elements in the first matrix may be divided into 2¹⁴ portions, and the position [formula missing] is taken as the division point that divides the distribution interval [0, Max1] into the two subintervals [formula missing] and [formula missing], such that the data represented in the second data format has a precision of at least [formula missing]. Compared with the first data format, the second data format has more mantissa bits and therefore higher precision, which can meet the precision requirement of most artificial intelligence models.

According to some embodiments, step S202 comprises: for each element in the first matrix, determining an absolute value a of the element; in response to the element being in the subinterval [formula missing], determining the exponent bit of the conversion element corresponding to the element to be 0, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element as [formula missing]; or, in response to the element being in the subinterval [formula missing], determining the exponent bit of the conversion element corresponding to the element to be 1, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element as [formula missing].

Further, by determining the subinterval in which the element is located, the distribution range of the element can be determined more accurately, and higher calculation accuracy is thus obtained. When the element is determined to fall within the subinterval [formula missing], whose values are distributed over a smaller range, the interval [formula missing] is itself subdivided into 2¹⁴ portions, so that the precision within this interval can be improved to [formula missing], while elements with larger values, which fall within the wider subinterval [formula missing], still have a precision of [formula missing].

It will be appreciated that the position of the set segmentation point may determine the maximum value corresponding to the element within each subinterval, and thus the accuracy of the element. Another embodiment for obtaining different precisions by setting different division points will be given below.

In this embodiment, the first division point divides the interval [0, Max1] into two subintervals, [formula missing] and [formula missing], wherein i is a positive integer less than n.

According to some embodiments, step S202 comprises: for each element in the first matrix, determining an absolute value a of the element; in response to the absolute value a being in the subinterval [formula missing]; or, in response to the absolute value a being in the subinterval [formula missing].

Taking n = 14 as an example, when the second data format is 16-bit data having 14 mantissa bits, each element in the first matrix is mapped to [0, 2¹⁴] so that the absolute value of each element is represented by 14 mantissa bits. When the first division point is determined as [formula missing], which is larger than [formula missing], then, compared with determining the first division point as [formula missing], the maximum value of the elements in the interval [formula missing] is changed from Max1 to [formula missing], so that the precision of the elements in that interval can be improved; that is, the precision of the elements in the interval [formula missing] is raised from [formula missing] to [formula missing].

In addition, the interval in which data precision is to be improved can be selected by setting different values of i. For an artificial intelligence model with a known data distribution, the value of i can be set according to that distribution, so that the precision of the data in the target interval is improved.

It can be understood that the conversion process for each element in the second matrix is the same as the above-mentioned conversion process for the element in the first matrix, and data in the first data format is converted into data in the second data format in a mapping manner, which is not described herein again.

According to some embodiments, the method 100 of data processing further comprises: for each element in the first matrix and each element in the second matrix, determining a recovery factor corresponding to the element, where the recovery factors satisfy the following conditions: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

It can be understood that the data processing method 100 converts a conventional floating-point data type into a new data type with more mantissa bits, so as to convert a floating point matrix calculation into a fixed-point calculation, thereby reducing the complexity of the calculation and saving hardware resources. After the fixed-point matrix calculation, the result still needs to be converted back into the original data type, so that the conversion process is transparent to the user and the user experience is improved.

Converting the matrix calculation result back into the original data type relies on the above-mentioned recovery factor. Specifically, when an element in the first data format is converted into a conversion element in the second data format, the element is mapped to [0, 2ⁿ] by multiplying the absolute value of the element by a factor; the calculation result can then be converted back into the original data type by multiplying it by the inverse of this factor, and this inverse is the recovery factor. The recovery factor is therefore determined by the following condition: the conversion element corresponding to the element, multiplied by the recovery factor corresponding to the element, is equal to the absolute value of the element.
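The recovery factor condition can be illustrated with a short sketch. It assumes a two-subinterval mapping in which exponent bit 0 means the mantissa was scaled over [0, split] and exponent bit 1 means it was scaled over [0, Max]; `split`, `recovery_factor`, and `decode` are hypothetical names, and the linear scaling is an assumption rather than the disclosure's exact formula.

```python
def recovery_factor(exp_bit, max_abs, split, n=14):
    """Inverse of the mapping factor: mantissa * factor approximates |x|."""
    # The mantissa spans [0, 2**n] over whichever range the exponent bit selects.
    span = max_abs if exp_bit else split
    return span / (1 << n)


def decode(sign, exp_bit, mantissa, max_abs, split, n=14):
    """Recover the signed floating point value from a conversion element."""
    value = mantissa * recovery_factor(exp_bit, max_abs, split, n)
    return -value if sign else value
```

Because the factor depends only on the exponent bit and on the per-matrix values Max and `split`, at most two distinct recovery factors per matrix are needed under these assumptions.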

According to some embodiments, an absolute value of each element in the third matrix is equal to a mantissa bit of a conversion element of a third element corresponding to the element in the first matrix multiplied by a mantissa bit of a conversion element of a fourth element corresponding to the element in the second matrix multiplied by a recovery factor corresponding to the third element multiplied by a recovery factor corresponding to the fourth element, and a sign bit of the element in the third matrix is an exclusive or value of a sign bit of the third element and a sign bit of the fourth element.
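The per-term rule above (mantissa times mantissa times the two recovery factors, with the sign given by an exclusive or) can be sketched as follows. The sketch assumes that the terms are summed over the inner dimension, as in ordinary matrix multiplication, and the triple layout `(sign, mantissa, recovery_factor)` and the name `dot` are hypothetical.

```python
def dot(row, col):
    """Dot product of two equal-length lists of (sign, mantissa, recovery_factor)."""
    total = 0.0
    for (s3, m3, r3), (s4, m4, r4) in zip(row, col):
        # Integer mantissa product first, then one rescale by both recovery factors.
        magnitude = (m3 * m4) * (r3 * r4)
        # Sign of the term is the XOR of the operand sign bits.
        total += -magnitude if s3 ^ s4 else magnitude
    return total
```

A practical implementation would more likely group terms that share the same pair of recovery factors and rescale each group once, keeping the inner loop purely integer; the per-term rescale here is only for readability.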

Therefore, matrix calculation on floating point numbers is converted into fixed point calculation with more mantissa bits, which reduces calculation complexity and improves calculation efficiency. The increase in mantissa bits also improves the precision of the data. Meanwhile, through the above process the data is converted automatically and the calculation result is automatically converted back to the original data type, so that the user does not perceive the data type or the data conversion used in the calculation, while still obtaining a more precise result and a more efficient calculation process.

According to another aspect of the present disclosure, there is provided a data processing apparatus. As shown in fig. 3, the data processing apparatus 300 includes: an obtaining module 301 configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; a first determining module 302 configured to determine a first element with the largest absolute value in the first matrix, and record the absolute value of the first element as a first maximum value Max1; a first mapping module 303 configured to map each element of the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1 to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits; a second determining module 304 configured to determine a second element with the largest absolute value in the second matrix, and record the absolute value of the second element as a second maximum value Max2; a second mapping module 305 configured to map each element of the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2 to convert the element into a corresponding conversion element; and a calculation module 306 configured to calculate a product of the first matrix and the second matrix as a third matrix based on the conversion element corresponding to each element of the first matrix and the second matrix.

Thus, each element of the first matrix stored in the first data format is converted by the first mapping module 303 into a conversion element stored in the second data format, such that each element of the first matrix is represented with more mantissa bits. Each element of the second matrix stored in the first data format is converted by the second mapping module 305 into a conversion element stored in the second data format, such that each element of the second matrix is represented with more mantissa bits.

The first determining module 302 and the second determining module 304 determine the maximum absolute value in each matrix, and thereby the data distribution in the matrix, so that the value of each element can be mapped onto the mantissa bits of the corresponding conversion element and the floating point matrix calculation is converted into a fixed point matrix operation, effectively reducing the difficulty of the matrix calculation and saving the hardware resources used for calculation. By converting the floating point data in the first data format into the second data format with more mantissa bits, the data processing apparatus 300 can therefore improve the calculation precision, and with it the training and inference accuracy of an artificial intelligence model that performs matrix operations using the apparatus.

Meanwhile, the first mapping module 303 and the second mapping module 305 implement conversion of floating-point number matrix calculation into fixed-point calculation that does not include an exponent by mapping the original floating-point type data onto all mantissa bits of the conversion element stored in the second data format, and the fixed-point calculation is less difficult than the exponent calculation, thereby improving the calculation efficiency and saving the hardware resources used for calculation.

It is to be understood that the data processing apparatus 300 provided by the present disclosure is not limited to be used for converting 16-bit data, and may also be used for converting 32-bit single-precision floating point numbers or 64-bit double-precision floating point numbers to convert floating point operations including exponent operations into fixed point operations not including exponents, thereby improving operation efficiency and saving hardware computing resources.

According to some embodiments, the first mapping module 303 comprises: a first determination unit configured to determine a first division point based on the first maximum value Max1, so as to divide the interval [0, Max1] into two subintervals; and a second determining unit configured to determine, for each element in the first matrix, the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and to map the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to that element.

It is to be understood that the first determination unit may determine the distribution range of the elements in the first matrix, i.e., [0, Max1], by determining the maximum of the absolute values of the elements in the matrix. The distribution range is further segmented by determining the division point, so that the mapping calculation is carried out separately according to the subinterval into which each element falls; the distribution range of the data is thus divided more finely, and higher calculation accuracy is obtained.

According to some embodiments, the first determining unit is further configured to: determine the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source].

Taking n = 14 as an example, when the second data format is 16-bit data having 14 mantissa bits, the first determination unit needs to map each element in the first matrix to the interval [0, 2¹⁴] so that the absolute value of each element is represented by 14 mantissa bits. The first determination unit may divide the distribution interval [0, Max1] of the elements in the first matrix into 2¹⁴ parts and split it at the first part into [formula omitted in source] and [formula omitted in source], such that the data represented in the second data format has at least [formula omitted in source].

According to some embodiments, the second determining unit comprises: a first determining subunit configured to determine, for each element in the first matrix, the absolute value a of the element; a second determining subunit configured to respond to the element being located in the subinterval [formula omitted in source]; or a third determining subunit configured to respond to the element being located in the subinterval [formula omitted in source].

Further, the second determining subunit and the third determining subunit determine the subinterval in which each element is located, so that the distribution range of the elements is determined more accurately and higher calculation accuracy is obtained. When the second determining subunit determines that the element falls into [formula omitted in source], the subinterval with the smaller range of values, the subinterval [formula omitted in source] is subdivided into 2¹⁴ parts, and the precision within that subinterval can be improved to [formula omitted in source], while elements with larger values, which fall within the larger subinterval [formula omitted in source], still have a precision of [formula omitted in source].

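According to some embodiments, the first determining unit is further configured to: determine the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source], wherein i is a positive integer less than n.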

According to some embodiments, the second determining unit comprises: a fourth determining subunit configured to determine, for each element in the first matrix, the absolute value a of the element; a fifth determining subunit configured to respond to the absolute value a being located in the subinterval [formula omitted in source]; or a sixth determining subunit configured to respond to the absolute value a being located in the subinterval [formula omitted in source].

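Compared with determining the first division point as [formula omitted in source], the maximum value of the elements in the subinterval [formula omitted in source] is increased from Max1 to [formula omitted in source], and the precision of the elements therein is raised from [formula omitted in source] to [formula omitted in source].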

It can be understood that the conversion process of the data processing apparatus 300 for each element in the second matrix is the same as the above-mentioned conversion process for the element in the first matrix, and the data in the first data format is converted into the data in the second data format in a mapping manner, which is not described herein again.

According to some embodiments, the data processing apparatus 300 further comprises: a third determining module configured to determine, for each element in the first matrix and each element in the second matrix, a restoration factor corresponding to the element, where the restoration factors satisfy the following conditions: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

It is understood that the data processing apparatus 300 converts a conventional floating-point data type into a new data type with more mantissa bits, so that floating-point matrix calculation becomes fixed-point calculation, which reduces computational complexity and saves hardware resources. After the fixed-point matrix calculation, the result still needs to be converted back into the original data type, so that the conversion process is transparent to the user, which improves the user experience.

The data processing apparatus 300 converts the matrix calculation result back into the original data type by means of the recovery factor determined by the third determining module. Specifically, when an element in the first data format is converted into a conversion element in the second data format, the first mapping module 303 maps the element to the interval [0, 2ⁿ] by multiplying the absolute value of the element by a factor. The matrix calculation result can then be converted back into the original data type by multiplying it by the inverse of this factor, which is the recovery factor. The recovery factor is determined so as to satisfy the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.
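The role of the recovery factor may be illustrated with the following minimal Python sketch, which multiplies two matrices using integer mantissas only and restores the result at the end. It assumes one recovery factor per matrix (no division point), and the names `quantize` and `matmul_fixed` are hypothetical and introduced here only for illustration.

```python
def quantize(matrix, n=14):
    """Return ((is_negative, mantissa) pairs, recovery factor) for a matrix.

    Assumption: a single interval [0, Max] per matrix, so one recovery factor.
    """
    max_abs = max(abs(x) for row in matrix for x in row)
    recovery = max_abs / (2 ** n)  # mantissa * recovery ~ |element|
    pairs = [
        [(x < 0, round(abs(x) / recovery) if recovery else 0) for x in row]
        for row in matrix
    ]
    return pairs, recovery


def matmul_fixed(a, b, n=14):
    """Multiply a (r x k) by b (k x c) via integer mantissa products."""
    qa, ra = quantize(a, n)
    qb, rb = quantize(b, n)
    rows, inner, cols = len(a), len(b), len(b[0])
    result = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            acc = 0  # integer accumulator; no exponent handling is needed
            for k in range(inner):
                sign_a, mant_a = qa[i][k]
                sign_b, mant_b = qb[k][j]
                term = mant_a * mant_b
                # The sign of each product is the XOR of the operand signs.
                acc += -term if (sign_a ^ sign_b) else term
            # Apply both recovery factors once to restore the original scale.
            out_row.append(acc * ra * rb)
        result.append(out_row)
    return result


print(matmul_fixed([[1.0, -0.5], [0.25, 0.0]], [[0.5], [1.0]]))
```

Because a single factor per matrix is assumed here, the two recovery factors can be applied once after accumulation. With per-element factors, as in the two-subinterval case, they would instead have to be applied to each product term.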

Thus, the data processing apparatus 300 converts floating-point matrix calculation into fixed-point calculation with more mantissa bits, which reduces the complexity and improves the efficiency of the calculation. The additional mantissa bits also improve the precision of the data. Meanwhile, through the above process, the data processing apparatus 300 converts data automatically, and the calculation result is automatically converted back to the original data type. The user does not perceive the data type used in the calculation or the data conversion process, while obtaining a more precise calculation result and a more efficient calculation process.

According to another aspect of the present disclosure, there is also provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a data processing method.

According to another aspect of the present disclosure, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute a data processing method.

According to another aspect of the present disclosure, there is also provided a computer program product comprising a computer program, wherein the computer program realizes the data processing method when executed by a processor.

According to another aspect of the present disclosure, there is also provided an electronic circuit comprising: a circuit configured to perform the data processing method, which electronic circuit may be implemented as a chip.

As shown in fig. 4, the electronic device 400 includes a computing unit 401 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic apparatus 400 can also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.

A number of components in the electronic device 400 are connected to the I/O interface 405, including: an input unit 406, an output unit 407, a storage unit 408, and a communication unit 409. The input unit 406 may be any type of device capable of inputting information to the electronic device 400; the input unit 406 may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a track pad, a track ball, a joystick, a microphone, and/or a remote controller. The output unit 407 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 408 may include, but is not limited to, magnetic or optical disks. The communication unit 409 allows the electronic device 400 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

Computing unit 401 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The calculation unit 401 executes the respective methods and processes described above, such as the data processing method. For example, in some embodiments, the data processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into RAM 403 and executed by computing unit 401, one or more steps of the data processing method described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the data processing method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

While embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems and apparatus are merely illustrative embodiments or examples and that the scope of the invention is not to be limited by these embodiments or examples, but only by the claims as issued and their equivalents. Various elements in the embodiments or examples may be omitted or may be replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure. Further, various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims

1. A method of data processing, comprising:

obtaining a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1;

determining a first element with the largest absolute value in the first matrix, and recording the largest absolute value of the first element as a first largest value Max1;

mapping each element in the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, so as to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits;

determining a second element with the largest absolute value in the second matrix, and recording the largest absolute value of the second element as a second largest value Max2;

mapping each element in the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2, so as to convert the element into a corresponding conversion element; and

calculating the product of the first matrix and the second matrix as a third matrix based on the conversion element corresponding to each element in the first matrix and the second matrix.

2. The method of claim 1, wherein the mapping each element in the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, so as to convert the element into a corresponding conversion element, comprises:

determining, based on the first maximum value, a first division point to divide the interval [0, Max1] into two subintervals; and

for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to the interval [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to that element.

3. The method of claim 2, wherein the determining, based on the first maximum value Max1, a first division point to divide the interval [0, Max1] into two subintervals comprises:

determining the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source].

4. The method of claim 3, wherein the determining, for each element in the first matrix, the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to the interval [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element comprises:

for each element of the first matrix,

determining the absolute value a of the element;

in response to the element being located in the subinterval [formula omitted in source]; or

in response to the element being located in the subinterval [formula omitted in source].

5. The method of claim 2, wherein the determining, based on the first maximum value Max1, a first division point to divide the interval [0, Max1] into two subintervals comprises:

determining the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source],

wherein i is a positive integer less than n.

6. The method of claim 5, wherein the determining, for each element in the first matrix, the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to the interval [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element comprises:

for each element of the first matrix,

determining the absolute value a of the element;

in response to the absolute value a being located in the subinterval [formula omitted in source]; or

in response to the absolute value a being located in the subinterval [formula omitted in source].

7. The method of any of claims 1-6, further comprising:

for each element in the first matrix and each element in the second matrix, determining a recovery factor corresponding to the element, where the recovery factors satisfy the following conditions:

the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

8. The method according to claim 7, wherein the absolute value of each element in the third matrix is equal to the mantissa bit of the conversion element of the third element corresponding to the element in the first matrix multiplied by the mantissa bit of the conversion element of the fourth element corresponding to the element in the second matrix multiplied by the recovery factor corresponding to the third element multiplied by the recovery factor corresponding to the fourth element, and the sign bit of the element in the third matrix is the exclusive or value of the sign bit of the third element and the sign bit of the fourth element.

9. A data processing apparatus comprising:

an obtaining module configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1;

a first determining module configured to determine a first element with a maximum absolute value in the first matrix, and record the maximum absolute value of the first element as a first maximum value Max1;

a first mapping module configured to map each element of the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, so as to convert the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits;

a second determining module configured to determine a second element with a largest absolute value in the second matrix, and record a largest absolute value of the second element as a second largest value Max2;

a second mapping module configured to map each element of the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2, so as to convert the element into a corresponding conversion element; and

a calculation module configured to calculate a product of the first matrix and the second matrix as a third matrix based on a conversion element corresponding to each element of the first matrix and the second matrix.

10. The apparatus of claim 9, wherein the first mapping module comprises:

a first determination unit configured to determine, based on the first maximum value, a first division point to divide the interval [0, Max1] into two subintervals; and

a second determination unit configured to determine, for each element in the first matrix, an exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and to map the element to the interval [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element.

11. The apparatus of claim 10, wherein the first determining unit is further configured to:

determine the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source].

12. The apparatus of claim 11, wherein the second determining unit comprises:

a first determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element;

a second determining subunit configured to respond to the element being located in the subinterval [formula omitted in source]; or

a third determining subunit configured to respond to the element being located in the subinterval [formula omitted in source].

13. The apparatus of claim 10, wherein the first determining unit is further configured to:

determine the first division point from the first maximum value Max1 as [formula omitted in source], so as to divide the interval [0, Max1] into two subintervals [formula omitted in source] and [formula omitted in source],

wherein i is a positive integer less than n.

14. The apparatus of claim 13, wherein the second determining unit comprises:

a fourth determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element;

a fifth determining subunit configured to respond to the absolute value a being located in the subinterval [formula omitted in source]; or

a sixth determining subunit configured to respond to the absolute value a being located in the subinterval [formula omitted in source].

15. The apparatus of any of claims 9-14, further comprising:

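a third determining module configured to determine, for each element in the first matrix and each element in the second matrix, a recovery factor corresponding to the element, the recovery factor satisfying the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.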

16. The apparatus according to claim 15, wherein the absolute value of each element in the third matrix is equal to the mantissa bit of the conversion element of the third element corresponding to the element in the first matrix multiplied by the mantissa bit of the conversion element of the fourth element corresponding to the element in the second matrix multiplied by the recovery factor corresponding to the third element multiplied by the recovery factor corresponding to the fourth element, and the sign bit of the element in the third matrix is the exclusive or value of the sign bit of the third element and the sign bit of the fourth element.

17. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.

18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-8.

19. A computer program product comprising a computer program, wherein the computer program realizes the method of any one of claims 1-8 when executed by a processor.

20. An electronic circuit, comprising:

circuitry configured to perform the method of any of claims 1-8.