CN115238236A

CN115238236A - Data processing method, data processing device, electronic equipment, medium and chip

Info

Publication number: CN115238236A
Application number: CN202210946383.6A
Authority: CN
Inventors: 王勇; 陈庆澍; 王京; 欧阳剑; 邰秀瑢
Original assignee: Kunlun Core Beijing Technology Co ltd
Current assignee: Kunlun Core Beijing Technology Co ltd
Priority date: 2022-08-08
Filing date: 2022-08-08
Publication date: 2022-10-25

Abstract

The disclosure provides a data processing method, a data processing device, an electronic device, a medium and a chip, and relates to the technical field of artificial intelligence, in particular to the technical field of artificial intelligence chips. The implementation scheme is as follows: obtaining a first matrix and a second matrix, wherein each element in the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; for each element in the first matrix and the second matrix, converting the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m; and calculating the product of the first matrix and the second matrix as a third matrix based on the conversion elements respectively corresponding to each element in the first matrix and the second matrix.

Description

Data processing method, data processing device, electronic equipment, medium and chip

Technical Field

The present disclosure relates to the field of artificial intelligence technology, and in particular, to the field of artificial intelligence chip technology, and more particularly, to a data processing method, apparatus, electronic device, computer-readable storage medium, and computer program product.

Background

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves technologies at both the hardware level and the software level. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.

Artificial intelligence models contain a large number of computationally intensive operators, mainly including matrix multiplication, convolution, pooling, activation, and so on. These calculations are very time consuming, and the computing power of a traditional CPU can hardly meet the performance requirements, so heterogeneous computing has become mainstream, and various artificial intelligence processors, including GPUs, FPGAs, and ASICs, are widely applied to artificial intelligence model calculations. Meanwhile, the choice of data type plays an important role in the precision and performance of artificial intelligence calculations.

The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.

Disclosure of Invention

The present disclosure provides a method, an apparatus, an electronic device, a computer-readable storage medium, a computer program product, and a chip.

According to an aspect of the present disclosure, there is provided a data processing method including: obtaining a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; for each element in the first and second matrices, converting the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element to which the conversion element corresponds; and calculating a product of the first matrix and the second matrix into a third matrix based on the conversion element corresponding to each element in the first matrix and the second matrix respectively.

According to another aspect of the present disclosure, there is provided a data processing apparatus including: an obtaining module configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; a conversion module configured to convert, for each element in the first matrix and the second matrix, the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element to which the conversion element corresponds; and the calculation module is configured to calculate a product of the first matrix and the second matrix into a third matrix based on the conversion element corresponding to each element in the first matrix and the second matrix respectively.

According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described method.

According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the above method.

According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program, wherein the computer program realizes the above method when executed by a processor.

According to another aspect of the present disclosure, there is provided an electronic circuit comprising: circuitry configured to perform the above-described method.

According to one or more embodiments of the present disclosure, a data processing method is provided which uses a data format having more mantissa bits than conventional floating-point data, so as to improve the precision of calculation and, in turn, the accuracy of artificial intelligence model training and inference. Meanwhile, the original floating-point data is mapped onto all of the mantissa bits, so that floating-point calculation is converted into fixed-point calculation without an exponent, which reduces the calculation difficulty, improves the calculation efficiency, and saves hardware resources for calculation.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for purposes of example only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.

FIG. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure;

FIG. 2 illustrates a flow diagram of a method of converting elements in a matrix to conversion elements, in accordance with an embodiment of the disclosure;

FIG. 3 illustrates a flow diagram of a method of converting elements in a first matrix to conversion elements, in accordance with an embodiment of the disclosure;

FIG. 4 shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure; and

FIG. 5 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the present disclosure, unless otherwise specified, the use of the terms "first", "second", and the like to describe various elements is not intended to limit the positional relationship, the temporal relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.

The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.

In the related art, the calculation of artificial intelligence models mainly uses the standard IEEE float type. As the technology has developed, new half-precision calculation types such as bfloat16, fp16, and int16 have emerged to replace the standard float. fp16 and bfloat16 include sign bits, exponent bits, and mantissa bits; because they have exponent bits, the range of values they can represent is large, but the multiple exponent bits make the calculation complicated, so the calculation efficiency is lower than that of int16. int16 is a fixed-point type whose calculation efficiency is higher than that of fp16/bfloat16, but because it has no exponent bits, the range of numbers it can express is smaller than that of fp16/bfloat16, and training fails to converge for many artificial intelligence models, so its range of use is relatively limited.

In order to solve the above problems, the present disclosure provides a data processing method which uses a data format having more mantissa bits than conventional floating-point data, so as to improve the precision of calculation and, in turn, the accuracy of artificial intelligence model training and inference. Meanwhile, the original floating-point data is mapped onto all of the mantissa bits, so that floating-point calculation is converted into fixed-point calculation without an exponent, which reduces the calculation difficulty, improves the calculation efficiency, and saves hardware resources for calculation.

Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

Fig. 1 shows a flow diagram of a data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the data processing method 100 includes: step S101, a first matrix and a second matrix are obtained, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and the first data format is a floating point type and has 1 sign bit and m mantissa bits, and m is an integer greater than 1; step S102, converting each element in the first matrix and the second matrix into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element corresponding to the conversion element; step S103, calculating a product of the first matrix and the second matrix as a third matrix based on a conversion element corresponding to each element in the first matrix and the second matrix.

Thus, in step S102, each element of the first matrix and the second matrix stored in the first data format is converted into a conversion element stored in the second data format, so that each element is represented with a larger number of mantissa bits. Specifically, the value of each element is converted onto the mantissa bits of the corresponding conversion element, and the floating-point matrix calculation is converted into a fixed-point matrix operation, which effectively reduces the difficulty of the matrix calculation and saves hardware resources for calculation. Because the floating-point data in the first data format is converted into the second data format with more mantissa bits, the calculation precision can be improved, and so can the training and inference accuracy of an artificial intelligence model that performs matrix operations using this method. Meanwhile, the original floating-point data is converted onto all mantissa bits of the conversion elements stored in the second data format, so that the floating-point matrix calculation is converted into fixed-point calculation without an exponent.

According to some embodiments, the first data format and the second data format each have 16 bits. By providing a second data format that has 16 bits in total, 14 of which are mantissa bits, the multiplication of floating-point data in a traditional first data format such as fp16 or bfloat16 is converted into fixed-point calculation on 14 mantissa bits. The hardware implementation is therefore simpler and more efficient, with efficiency that can approach that of the integer type int16 of the same bit width; compared with calculation types such as fp16 and bfloat16, the use of hardware resources can be reduced and the peak performance of the artificial intelligence chip can be improved.
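As a minimal sketch of the 16-bit layout described above, the following Python packs and unpacks one sign bit, one exponent bit and 14 mantissa bits. The field order (sign in the top bit, then the exponent bit, then the mantissa) and the names `pack` and `unpack` are assumptions made for illustration; the text fixes only the field widths.

```python
N_MANTISSA = 14  # n in the text: 1 sign + 1 exponent + 14 mantissa = 16 bits
MANTISSA_MASK = (1 << N_MANTISSA) - 1


def pack(sign, exponent, mantissa):
    """Pack the three fields into one 16-bit word.

    Assumed bit order (not stated in the text): sign | exponent | mantissa.
    """
    if sign not in (0, 1) or exponent not in (0, 1):
        raise ValueError("sign and exponent are single bits")
    if not 0 <= mantissa <= MANTISSA_MASK:
        raise ValueError("mantissa must fit in 14 bits")
    return (sign << 15) | (exponent << 14) | mantissa


def unpack(word):
    """Split a 16-bit word back into (sign, exponent, mantissa)."""
    return (word >> 15) & 1, (word >> 14) & 1, word & MANTISSA_MASK


word = pack(1, 0, 5)
fields = unpack(word)  # recovers the three fields that were packed
```

Because the word is the same width as int16, fp16 and bfloat16, a matrix stored this way occupies the same amount of memory as in the first data format.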

For convenience of description, the following description will be given taking as an example that the second data format contains 16-bit data and the mantissa bit number n is 14. However, it is understood that the present disclosure is not limited to 16-bit data conversion, and may also be used for conversion of 32-bit single-precision floating point numbers or 64-bit double-precision floating point numbers to convert floating point operations including exponent operations into fixed point operations not including exponents, thereby improving operation efficiency and saving hardware computing resources.

Fig. 2 shows a flow diagram of a method of converting elements in a matrix into conversion elements according to an embodiment of the disclosure. As shown in fig. 2, step S102 includes: step S201, determining a first element with the largest absolute value in the first matrix, and recording the absolute value of the first element as a first maximum value Max1; step S202, based on the first maximum value Max1, mapping each element in the first matrix to the interval [0, 2^n] so as to convert the element into a corresponding conversion element; step S203, determining a second element with the largest absolute value in the second matrix, and recording the absolute value of the second element as a second maximum value Max2; and step S204, based on the second maximum value Max2, mapping each element in the second matrix to the interval [0, 2^n] so as to convert the element into a corresponding conversion element.

Thus, the data distribution of each of the two matrices is determined by finding the maximum absolute value in the first matrix and in the second matrix, respectively, so that the data is mapped according to its own distribution. Specifically, each element in a matrix is mapped to the interval [0, 2^n] based on the maximum absolute value of that matrix, so that the absolute value of the element is represented with all mantissa bits of the conversion element stored in the second data format, which allows floating-point calculation to be converted into fixed-point calculation in the subsequent matrix calculation.
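The mapping of steps S201 and S202 can be sketched as follows, ignoring the exponent bit for the moment. The function name `to_mantissas`, rounding to the nearest integer, and the handling of an all-zero matrix are assumptions; the text specifies only that each absolute value is mapped onto [0, 2^n] relative to the maximum absolute value of its matrix.

```python
def to_mantissas(matrix, n=14):
    """Map |x| linearly from [0, max_abs] onto [0, 2**n].

    Returns (converted, max_abs), where converted holds (sign, mantissa)
    pairs. Rounding to nearest is an assumption, not taken from the text.
    """
    max_abs = max(abs(x) for row in matrix for x in row)
    if max_abs == 0:
        # Assumed convention: an all-zero matrix maps to all-zero mantissas.
        return [[(0, 0) for _ in row] for row in matrix], 0.0
    scale = (1 << n) / max_abs
    converted = [
        [(1 if x < 0 else 0, round(abs(x) * scale)) for x in row]
        for row in matrix
    ]
    return converted, max_abs


converted, max_abs = to_mantissas([[1.0, -2.0], [0.5, 0.0]])
```

Note that the element with the largest absolute value maps to 2^n itself, which needs n + 1 bits; an encoder that stores exactly n mantissa bits would presumably have to saturate at 2^n - 1, a detail the text does not discuss.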

Fig. 3 shows a flow diagram of a method of converting elements in a first matrix into conversion elements according to an embodiment of the present disclosure. As shown in fig. 3, step S202 includes: step S301, determining a first division point based on the first maximum value, so as to divide the interval [0, Max1] into two sub-intervals; and step S302, for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the sub-interval in which the element is located, and mapping the element to [0, 2^n] to determine the mantissa bits of the conversion element corresponding to the element.

It will be appreciated that the distribution range of the elements in the first matrix, i.e. [0, Max1], can be determined by determining the maximum of the absolute values of the elements in the matrix. The distribution range is then segmented at the division point, so that the mapping is calculated separately for the sub-interval into which each element falls; the distribution range of the data is thereby divided more finely and higher calculation accuracy is obtained.
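A sketch of how the two sub-intervals might drive the exponent bit and the mantissa is given below. The exact division point is not stated in the surrounding prose, so the sketch assumes a division point of Max1 / 2^i with a hypothetical default i = 7; the function name `convert`, the treatment of the boundary value, and the rounding rule are likewise illustrative assumptions.

```python
def convert(x, max_abs, n=14, i=7):
    """Return (sign, exponent, mantissa) for one element x.

    Assumptions (not from the text): the division point is max_abs / 2**i,
    the boundary value belongs to the lower sub-interval, and mantissas are
    rounded to nearest. max_abs must be positive.
    """
    sign = 1 if x < 0 else 0
    a = abs(x)
    split = max_abs / (1 << i)
    if a <= split:
        # Lower sub-interval [0, split]: exponent bit 0, finer step size.
        exponent, upper = 0, split
    else:
        # Upper sub-interval (split, max_abs]: exponent bit 1.
        exponent, upper = 1, max_abs
    mantissa = round(a * (1 << n) / upper)
    return sign, exponent, mantissa


large = convert(-2.0, max_abs=2.0)
small = convert(2.0 ** -7, max_abs=2.0)
```

The single exponent bit therefore acts as a selector between two scales rather than as a general power-of-two exponent, which is what keeps the later multiplication a fixed-point operation.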

According to some embodiments, step S301 comprises: determining the first division point from the first maximum value Max1 as […], so as to divide the interval [0, Max1] into two sub-intervals […] and […].

Taking n as 14 for example, when the second data format is 16-bit data having 14 mantissa bits, each element in the first matrix is mapped to [0, 2^14] to represent the absolute value of each element by 14 mantissa bits. The distribution interval [0, Max1] of the elements in the first matrix may be divided into 2^14 portions, and the position of the first portion […] is taken as the division point, so that the distribution interval [0, Max1] is divided into two sub-intervals […] and […], and data represented in the second data format has an accuracy of at least […]. Compared with the first data format, the second data format has more mantissa bits, so that it has higher precision and can meet the precision requirement of most artificial intelligence models.

According to some embodiments, step S302 comprises: for each element in the first matrix, determining an absolute value a of the element; in response to the element being in the sub-interval […], determining the exponent bit of the conversion element corresponding to the element to be 0, and mapping the element to [0, 2^n] to determine the mantissa bits of the conversion element corresponding to the element as […]; or, in response to the element being in the sub-interval […], determining the exponent bit of the conversion element corresponding to the element to be 1, and mapping the element to [0, 2^n] to determine the mantissa bits of the conversion element corresponding to the element as […].

Further, by determining the sub-interval in which an element is located, the distribution range of the element can be determined more accurately, thereby obtaining higher calculation precision. When it is determined that an element falls within the sub-interval […], which has a smaller value distribution, this sub-interval […] is subdivided into 2^14 portions, so that the precision within the sub-interval […] can be improved to […], while elements with larger values, which fall within the larger sub-interval […], still have an accuracy of […].

It will be appreciated that the position of the set segmentation point may determine the maximum value corresponding to the element within each subinterval, and thus the accuracy of the element. Another embodiment for obtaining different precisions by setting different division points will be given below.

According to some embodiments, the first division point is determined as […], so as to divide the interval [0, Max1] into two sub-intervals […] and […], wherein i is a positive integer less than n.

According to some embodiments, step S302 comprises: for each element in the first matrix, determining the absolute value a of the element; in response to the absolute value a being in the sub-interval […]; or, in response to the absolute value a being in the sub-interval […].

Taking n as 14 for example, when the second data format is 16-bit data having 14 mantissa bits, each element in the first matrix is mapped to [0, 2^14] to represent the absolute value of each element by 14 mantissa bits. When the first division point is determined to be […], which is larger than […], then compared to determining the first division point as […], the maximum value to which the elements located in the interval […] are mapped is increased from Max1 to […], so that the precision of the elements in that interval can be improved; that is, the precision of the elements in the interval […] is raised from […] to […].

In addition, by setting different values of i, it is possible to choose the interval in which the accuracy of the data is to be improved. For an artificial intelligence model with a particular data distribution, the value of i can be set according to that distribution, so that the precision of the data in the target interval is improved.
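To see how the choice of i trades precision between the two sub-intervals, the following sketch computes the quantization step in each sub-interval. It assumes, for illustration only, that the division point is Max1 / 2^i and that each sub-interval is divided into 2^n equal steps; `step_sizes` is a hypothetical helper, not a function named in the text.

```python
def step_sizes(max_abs, n=14, i=7):
    """Return (lower_step, upper_step) for the two sub-intervals.

    Assumption: the division point is max_abs / 2**i (not confirmed by the text).
    """
    split = max_abs / (1 << i)
    lower_step = split / (1 << n)    # elements in [0, split], exponent bit 0
    upper_step = max_abs / (1 << n)  # elements above split, exponent bit 1
    return lower_step, upper_step


# A larger i shrinks the lower sub-interval and makes its steps finer,
# while the step in the upper sub-interval stays the same.
for i in (1, 7, 13):
    lower, upper = step_sizes(2.0, i=i)
    print(f"i={i:2d}  lower step={lower:.3e}  upper step={upper:.3e}")
```

Under these assumptions, a model whose values cluster near zero would favor a larger i, whereas a model with values spread across the whole range would favor a smaller one.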

It can be understood that the conversion process for each element in the second matrix is the same as the above-mentioned conversion process for the element in the first matrix, and data in the first data format is converted into data in the second data format in a mapping manner, which is not described herein again.

According to some embodiments, the data processing method 100 further comprises: for each element in the first matrix and each element in the second matrix, determining a recovery factor corresponding to the element, where the recovery factor satisfies the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

It can be understood that the data processing method 100 converts a conventional floating-point data type into a new data type with more mantissa bits, so as to convert floating-point matrix calculation into fixed-point calculation, thereby reducing the complexity of the calculation and saving hardware resources. After the matrix calculation has been performed in fixed point, the calculation result still needs to be converted back to the original data type, so that the conversion process is transparent to the user and the user experience is improved.

Converting the matrix calculation result back into the original data type relies on the above-mentioned recovery factor. Specifically, when an element in the first data format is converted into a conversion element in the second data format, the element is mapped to [0, 2^n] by multiplying the absolute value of the element by a factor; the matrix calculation result can be converted back into the original data type by multiplying it by the inverse of this factor, and this inverse is the recovery factor. The recovery factor is determined by the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.
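The relationship between the mapping factor and the recovery factor can be illustrated with a short sketch. The helper `encode`, the division point Max1 / 2^i with i = 7, and rounding to nearest are assumptions made for the example; only the condition that mantissa times recovery factor gives back the absolute value comes from the text, and with rounding it holds to within half a quantization step.

```python
def encode(a, max_abs, n=14, i=7):
    """Encode a non-negative value a; return (exponent, mantissa, recovery).

    Assumption: the division point is max_abs / 2**i (hypothetical choice).
    max_abs must be positive.
    """
    split = max_abs / (1 << i)
    exponent, upper = (0, split) if a <= split else (1, max_abs)
    factor = (1 << n) / upper       # maps [0, upper] onto [0, 2**n]
    mantissa = round(a * factor)
    return exponent, mantissa, 1.0 / factor  # recovery = inverse of the factor


exponent, mantissa, recovery = encode(0.3, max_abs=2.0)
restored = mantissa * recovery      # close to 0.3, within half a step
```

Since the recovery factor depends only on the matrix maximum and on the sub-interval, at most two distinct recovery factors are needed per matrix under these assumptions.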

According to some embodiments, an absolute value of each element in the third matrix is equal to a mantissa bit of a conversion element of a third element corresponding to the element in the first matrix multiplied by a mantissa bit of a conversion element of a fourth element corresponding to the element in the second matrix multiplied by a recovery factor corresponding to the third element multiplied by a recovery factor corresponding to the fourth element, and a sign bit of the element in the third matrix is an exclusive or value of a sign bit of the third element and a sign bit of the fourth element.
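The rule above can be sketched end to end as follows. The sentence describes a single product term; summing the terms over the inner index is ordinary matrix multiplication and is an assumption about how the terms are combined. The helper names, the division point Max / 2^i with i = 7, and the rounding rule are also illustrative assumptions, and both matrices are assumed to contain at least one non-zero element.

```python
def encode(x, max_abs, n=14, i=7):
    """Return (sign, mantissa, recovery_factor) for one element.

    Assumption: the division point is max_abs / 2**i (hypothetical choice).
    """
    a = abs(x)
    split = max_abs / (1 << i)
    upper = split if a <= split else max_abs
    return (1 if x < 0 else 0), round(a * (1 << n) / upper), upper / (1 << n)


def matmul(A, B, n=14, i=7):
    """Multiply A by B using integer mantissa products and recovery factors."""
    max_a = max(abs(x) for row in A for x in row)
    max_b = max(abs(x) for row in B for x in row)
    EA = [[encode(x, max_a, n, i) for x in row] for row in A]
    EB = [[encode(x, max_b, n, i) for x in row] for row in B]
    C = []
    for row in EA:
        out = []
        for j in range(len(EB[0])):
            acc = 0.0
            for k, (sa, ma, ra) in enumerate(row):
                sb, mb, rb = EB[k][j]
                term = (ma * mb) * (ra * rb)  # integer product, then recovery
                acc += -term if sa ^ sb else term  # sign = XOR of operand signs
            out.append(acc)
        C.append(out)
    return C


C = matmul([[1.0, -2.0], [0.5, 0.25]], [[2.0, 0.0], [-1.0, 4.0]])
```

The inputs in this example are exactly representable, so the result matches the floating-point product; for general inputs the result differs from the exact product by the accumulated quantization error.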

Therefore, matrix calculation on floating-point numbers is converted into fixed-point calculation with more mantissa bits, which reduces calculation complexity and improves calculation efficiency. The increase in mantissa bits also improves the precision of the data. Meanwhile, data conversion is automatic through the above process, and the calculation result is automatically converted back to the original data type; the user is not aware of the data type or the data conversion used in the calculation, and still obtains a more precise result from a more efficient calculation.

According to another aspect of the present disclosure, a data processing apparatus is provided. As shown in fig. 4, the data processing apparatus 400 includes: an obtaining module 401 configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1; a conversion module 402 configured to convert each element of the first matrix and the second matrix into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element to which the conversion element corresponds; and a calculation module 403 configured to calculate a product of the first matrix and the second matrix as a third matrix based on a conversion element corresponding to each element in the first matrix and the second matrix, respectively.

Thus, the conversion module 402 converts each element of the first and second matrices stored in the first data format into a conversion element stored in the second data format, so that each element of the first and second matrices is represented with a larger number of mantissa bits. The value of each element is converted onto the mantissa bits of the corresponding conversion element, and the floating-point matrix calculation is converted into a fixed-point matrix operation, which effectively reduces the difficulty of the matrix calculation and saves hardware resources for calculation. By converting the floating-point data in the first data format into the second data format having more mantissa bits, the data processing apparatus 400 can improve the precision of calculation, and can improve the training and inference accuracy of an artificial intelligence model that performs matrix operations using this method. Meanwhile, the conversion module 402 converts the original floating-point data onto all mantissa bits of the conversion elements stored in the second data format, so as to convert the floating-point matrix calculation into fixed-point calculation without an exponent.

It is understood that the data processing apparatus 400 provided by the present disclosure is not limited to be used for converting 16-bit data, and may also be used for converting 32-bit single-precision floating point data or 64-bit double-precision floating point data to convert a floating point operation including an exponent operation into a fixed point operation not including an exponent, thereby improving operation efficiency and saving hardware computing resources.

According to some embodiments, the conversion module 402 comprises: a first determining unit configured to determine a first element with the largest absolute value in the first matrix, and record the absolute value of the first element as a first maximum value Max1; a first mapping unit configured to map each element of the first matrix to the interval [0, 2^n] based on the first maximum value Max1, so as to convert the element into a corresponding conversion element; a second determining unit configured to determine a second element with the largest absolute value in the second matrix, and record the absolute value of the second element as a second maximum value Max2; and a second mapping unit configured to map each element of the second matrix to the interval [0, 2^n] based on the second maximum value Max2, so as to convert the element into a corresponding conversion element.

Thus, the conversion module 402 determines the data distribution of the first matrix and of the second matrix by finding the maximum absolute value in each of the two matrices, so that the data is mapped according to its own distribution. Specifically, each element in a matrix is mapped to the interval [0, 2^n] based on the maximum absolute value of that matrix, so that the absolute value of the element is represented with all mantissa bits of the conversion element stored in the second data format, which allows floating-point calculation to be converted into fixed-point calculation in the subsequent matrix calculation.

According to some embodiments, the first mapping unit comprises: a first determining subunit configured to determine a first division point based on the first maximum value, so as to divide the interval [0, Max1] into two sub-intervals; and a second determining subunit configured to determine, for each element in the first matrix, the exponent bit of the conversion element corresponding to the element based on the sub-interval in which the element is located, and map the element to [0, 2^n] to determine the mantissa bits of the conversion element corresponding to the element.

It is to be understood that the distribution range of the elements in the first matrix, i.e., [0, Max1], can be determined by determining the maximum absolute value of the elements in the matrix. The first determining subunit further segments this distribution range by determining the division point, so that each element is mapped according to the subinterval in which it falls; the distribution range of the data is thus divided more finely, and higher calculation accuracy is obtained.

According to some embodiments, the first determining subunit is further configured to: determine the first division point as Max1/2ⁿ, i.e., 1/2ⁿ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁿ and one from Max1/2ⁿ to Max1.

Taking n as 14 for example, when the second data format is 16-bit data having 14 mantissa bits, each element in the first matrix needs to be mapped to [0, 2¹⁴] so that the absolute value of each element is represented by 14 mantissa bits. The first determining subunit may divide the distribution interval [0, Max1] of the elements in the first matrix into 2¹⁴ parts and take the end of the first part, Max1/2¹⁴, as the division point, dividing the distribution interval [0, Max1] into two subintervals, one from 0 to Max1/2¹⁴ and one from Max1/2¹⁴ to Max1, so that data represented in the second data format has a precision of at least Max1/2¹⁴.

According to some embodiments, the second determining subunit comprises: a third determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element; and a fourth determining subunit configured to, in response to the absolute value a being located in the first of the two subintervals, determine that the exponent bit of the conversion element corresponding to the element is 0 and map the element to [0, 2ⁿ] to determine the mantissa bits of that conversion element; or a fifth determining subunit configured to, in response to the absolute value a being located in the second of the two subintervals, determine that the exponent bit of the conversion element corresponding to the element is 1 and map the element to [0, 2ⁿ] to determine the mantissa bits of that conversion element.

Further, the subinterval in which each element is located is determined by the fourth determining subunit and the fifth determining subunit, so that the distribution range of the elements can be determined more accurately and higher calculation accuracy is obtained. When the second determining subunit determines that an element falls into the subinterval from 0 to Max1/2¹⁴, which holds the smaller values, this subinterval is itself subdivided into 2¹⁴ parts, so that the precision within it can be improved to Max1/2²⁸, while elements with larger values, which fall within the larger subinterval from Max1/2¹⁴ to Max1, still have a precision of Max1/2¹⁴.
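The two-subinterval conversion can be sketched in Python as follows. This is a sketch under explicit assumptions rather than the disclosed implementation: it assumes a division point of Max1/2ⁿ, assumes that each subinterval is mapped by multiplying with a single factor, assumes that an element exactly on the division point belongs to the smaller subinterval, and uses the hypothetical name `convert_element`.

```python
def convert_element(x, max1, n):
    """Convert x to (sign, exponent_bit, mantissa, recovery_factor).

    Assumes max1 > 0 and a split of [0, max1] at max1 / 2**n.
    """
    a = abs(x)
    sign = 1 if x < 0 else 0
    split = max1 / 2 ** n
    if a <= split:
        # Smaller subinterval: stretch [0, split] over [0, 2**n].
        exponent_bit = 0
        factor = 2 ** n / split
    else:
        # Larger subinterval: scale by the factor that sends max1 to 2**n.
        exponent_bit = 1
        factor = 2 ** n / max1
    mantissa = round(a * factor)
    return sign, exponent_bit, mantissa, 1.0 / factor


small = convert_element(0.5, max1=16384.0, n=14)
large = convert_element(-4096.0, max1=16384.0, n=14)
```

With Max1 = 2¹⁴ the division point is 1.0, so 0.5 is stored with exponent bit 0 at a step of 2⁻¹⁴, while 4096 is stored with exponent bit 1 at a step of 1. The single exponent bit therefore selects which of the two scaling factors applies.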

According to some embodiments, the first determining subunit is further configured to: determine the first division point as Max1/2ⁱ, i.e., 1/2ⁱ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁱ and one from Max1/2ⁱ to Max1, wherein i is a positive integer less than n.
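The effect of choosing the parameter i can be explored with a short Python sketch. It rests on assumptions that should be read as such: the division point is taken to be Max1/2ⁱ, each subinterval is assumed to be spread linearly over 2ⁿ mantissa steps, and the function name `subinterval_steps` is hypothetical.

```python
def subinterval_steps(max1, n, i):
    """Return (split, lower_step, upper_step) for a split at max1 / 2**i."""
    if not 0 < i < n:
        raise ValueError("i must be a positive integer less than n")
    split = max1 / 2 ** i
    lower_step = split / 2 ** n  # [0, split] spread over 2**n mantissa steps
    upper_step = max1 / 2 ** n   # values up to max1 spread over 2**n steps
    return split, lower_step, upper_step


split, lower_step, upper_step = subinterval_steps(max1=1024.0, n=14, i=4)
```

Under these assumptions a smaller i widens the finely resolved subinterval at the cost of a coarser step inside it, so i acts as a tuning knob between range and resolution for small values.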

According to some embodiments, the second determining subunit comprises: a sixth determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element; and a seventh determining subunit configured to, in response to the absolute value a being located in the first of the two subintervals, determine that the exponent bit of the conversion element corresponding to the element is 0 and map the element to [0, 2ⁿ] to determine the mantissa bits of that conversion element; or an eighth determining subunit configured to, in response to the absolute value a being located in the second of the two subintervals, determine that the exponent bit of the conversion element corresponding to the element is 1 and map the element to [0, 2ⁿ] to determine the mantissa bits of that conversion element.

Compared to determining the first division point as

Is in the section

The maximum value of the element in (1) is increased from Max1 to

The precision of the element in (1) is determined by

Is lifted to

It can be understood that the conversion process of the data processing apparatus 400 on each element in the second matrix is the same as the above-mentioned conversion process on the element in the first matrix, and data in the first data format is converted into data in the second data format in a mapping manner, which is not described herein again.

According to some embodiments, the data processing apparatus 400 further comprises: a determining module configured to determine, for each element in the first matrix and each element in the second matrix, a recovery factor corresponding to the element, where the recovery factors satisfy the following conditions: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

It is understood that the data processing apparatus 400 is used to convert a conventional floating-point data type into a new data type with more mantissa bits, so as to convert a floating-point matrix calculation into a fixed-point calculation, thereby reducing the complexity of the calculation and saving hardware resources. After the matrix calculation is performed in fixed point, the calculation result still needs to be converted back to the original data type, so that the conversion process is transparent to the user and the user experience is improved.
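As a concrete picture of a data type with 1 sign bit, 1 exponent bit and n = 14 mantissa bits, the following Python sketch packs the three fields into one 16-bit word. The bit order (sign in the top bit, then the exponent bit, then the mantissa) and the names `pack` and `unpack` are assumptions made for illustration; the disclosure does not fix a layout here.

```python
N_MANTISSA = 14  # n = 14 gives 1 + 1 + 14 = 16 bits in total


def pack(sign, exponent_bit, mantissa):
    """Pack the three fields into one 16-bit integer (assumed layout)."""
    if sign not in (0, 1) or exponent_bit not in (0, 1):
        raise ValueError("sign and exponent_bit must be 0 or 1")
    if not 0 <= mantissa < (1 << N_MANTISSA):
        raise ValueError("mantissa does not fit in 14 bits")
    return (sign << 15) | (exponent_bit << 14) | mantissa


def unpack(word):
    """Split a 16-bit integer back into (sign, exponent_bit, mantissa)."""
    return (word >> 15) & 1, (word >> 14) & 1, word & ((1 << N_MANTISSA) - 1)


word = pack(1, 0, 5)
```

Because the word has the same width as a 16-bit floating-point value, it can occupy the same storage while devoting more of its bits to the mantissa.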

The process by which the data processing apparatus 400 converts the matrix calculation result into the original data type requires the recovery factor determined by the determining module. Specifically, when converting an element of the first data format into a conversion element of the second data format, the first mapping module 303 maps the element to [0, 2ⁿ] by multiplying the absolute value of the element by a factor. The matrix calculation result can be converted into the original data type by multiplying the calculation result by the inverse of this factor, which is the recovery factor. The recovery factor is therefore determined by the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

Thus, the data processing apparatus 400 converts the floating-point matrix calculation into a fixed-point calculation with more mantissa bits, which reduces the complexity of the calculation and improves its efficiency. Because the number of mantissa bits increases, the precision of the data is also improved. Meanwhile, through the above process, the data processing apparatus 400 converts the data automatically and the calculation result can be automatically converted back to the original data type, so the user need not be aware of the data type or the data conversion process used in the calculation, while still obtaining a calculation result and calculation process with higher precision and higher efficiency.
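The overall flow (convert, multiply integer mantissas, apply recovery factors, combine signs by exclusive or) can be sketched end to end in Python. This is a simplified illustration, not the disclosed hardware: the division point Max/2ⁿ, the per-subinterval scaling factors, the rounding, and the names `convert` and `matmul_fixed` are all assumptions, and the accumulation is done in ordinary Python floats.

```python
def convert(matrix, n):
    """Convert a matrix to rows of (sign, mantissa, recovery_factor)."""
    max_abs = max(abs(x) for row in matrix for x in row)
    split = max_abs / 2 ** n  # assumed division point
    out = []
    for row in matrix:
        new_row = []
        for x in row:
            a = abs(x)
            if a == 0:
                new_row.append((0, 0, 0.0))
                continue
            # Assumed factors: stretch the small subinterval, scale the large one.
            factor = 2 ** n / split if a <= split else 2 ** n / max_abs
            new_row.append((1 if x < 0 else 0, round(a * factor), 1.0 / factor))
        out.append(new_row)
    return out


def matmul_fixed(first, second, n=14):
    """Multiply two matrices via integer mantissa products and recovery factors."""
    a, b = convert(first, n), convert(second, n)
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for k in range(inner):
                sa, ma, ra = a[r][k]
                sb, mb, rb = b[k][c]
                magnitude = (ma * mb) * ra * rb  # integer product, then recovery
                total += -magnitude if sa ^ sb else magnitude
            result[r][c] = total
    return result


product = matmul_fixed([[1.0, -2.0]], [[3.0], [0.5]])
```

The exact product here is 1.0 × 3.0 + (-2.0) × 0.5 = 2.0; the sketch returns a value close to 2.0, with the small difference coming from rounding 0.5 to the nearest mantissa step of the second matrix.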

According to another aspect of the present disclosure, there is also provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a data processing method.

According to another aspect of the present disclosure, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute a data processing method.

According to another aspect of the present disclosure, there is also provided a computer program product comprising a computer program, wherein the computer program realizes the data processing method when executed by a processor.

According to another aspect of the present disclosure, there is also provided an electronic circuit comprising: a circuit configured to perform the data processing method, which electronic circuit may be implemented as a chip.

As shown in FIG. 5, the electronic device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic device 500 can also be stored. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

A number of components in the electronic device 500 are connected to the I/O interface 505, including: an input unit 506, an output unit 507, a storage unit 508, and a communication unit 509. The input unit 506 may be any type of device capable of inputting information to the electronic device 500; the input unit 506 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, keyboard, touch screen, track pad, track ball, joystick, microphone, and/or remote control. The output unit 507 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 508 may include, but is not limited to, a magnetic disk and an optical disk. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices via a computer network, such as the Internet, and/or various telecommunications networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMAX devices, cellular communication devices, and/or the like.

The computing unit 501 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The calculation unit 501 performs the respective methods and processes described above, such as a data processing method. For example, in some embodiments, the data processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the data processing method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the data processing method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.

It should be understood that the various forms of flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, which is not limited herein as long as the desired results of the technical solutions of the present disclosure can be achieved.

While embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems and apparatus are merely illustrative embodiments or examples and that the scope of the invention is not to be limited by these embodiments or examples, but only by the claims as issued and their equivalents. Various elements in the embodiments or examples may be omitted or may be replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure. Further, various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced with equivalent elements that appear after the present disclosure.

Claims

1. A method of data processing, comprising:

obtaining a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1;

for each element in the first and second matrices, converting the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element to which the conversion element corresponds; and

calculating a product of the first matrix and the second matrix as a third matrix based on the conversion elements respectively corresponding to the elements in the first matrix and the second matrix.

2. The method of claim 1, wherein for each element in the first and second matrices, converting the element to a corresponding conversion element comprises:

determining a first element with the largest absolute value in the first matrix, and recording the largest absolute value of the first element as a first largest value Max1;

mapping each element in the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, to convert the element into a corresponding conversion element;

determining a second element with the largest absolute value in the second matrix, and recording the largest absolute value of the second element as a second largest value Max2; and

mapping each element in the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2, to convert the element into a corresponding conversion element.

3. The method of claim 2, wherein the mapping each element in the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, to convert the element into a corresponding conversion element, comprises:

based on the first maximum, determining a first split point to split the interval [0, max1] into two sub-intervals; and

for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to that element.

4. The method of claim 3, wherein the determining a partitioning point to partition an interval [0, max1] into two sub-intervals based on the first maximum value Max1 comprises:

determining the first division point as Max1/2ⁿ, i.e., 1/2ⁿ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁿ and one from Max1/2ⁿ to Max1.

5. The method of claim 4, wherein for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element comprises:

for each element of the first matrix,

determining the absolute value a of the element;

in response to the element being in a subinterval

Or

In response to the element being in a subinterval

6. The method of claim 3, wherein the determining a partitioning point to partition an interval [0, max1] into two sub-intervals based on the first maximum value Max1 comprises:

determining the first division point as Max1/2ⁱ, i.e., 1/2ⁱ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁱ and one from Max1/2ⁱ to Max1, wherein i is a positive integer less than n.

7. The method of claim 5, wherein for each element in the first matrix, determining the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and mapping the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to the element comprises:

for each element of the first matrix,

determining the absolute value a of the element;

in response to the absolute value a being in a subinterval

Or

In response to the absolute value a being in a subinterval

8. The method of any of claims 1-7, further comprising:

for each element in the first matrix and each element in the second matrix, determining a recovery factor corresponding to the element, where the recovery factors satisfy the following conditions:

the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.

9. The method according to claim 8, wherein the absolute value of each element in the third matrix is equal to the mantissa bit of the conversion element of the third element corresponding to the element in the first matrix multiplied by the mantissa bit of the conversion element of the fourth element corresponding to the element in the second matrix multiplied by the recovery factor corresponding to the third element multiplied by the recovery factor corresponding to the fourth element, and the sign bit of the element in the third matrix is the exclusive or value of the sign bit of the third element and the sign bit of the fourth element.

10. A data processing apparatus comprising:

an obtaining module configured to obtain a first matrix and a second matrix, wherein each element of the first matrix and the second matrix is stored in a first storage unit in a first data format, and wherein the first data format is of a floating point type and has 1 sign bit and m mantissa bits, m being an integer greater than 1;

a conversion module configured to convert, for each element in the first matrix and the second matrix, the element into a corresponding conversion element, wherein the conversion element is stored in a second storage unit in a second data format, and wherein the second data format has 1 sign bit, 1 exponent bit, and n mantissa bits, n being an integer greater than m, and wherein the second data format and the first data format have the same number of bits, and the mantissa bit of each conversion element represents a value of the element to which the conversion element corresponds; and

a calculation module configured to calculate a product of the first matrix and the second matrix as a third matrix based on a conversion element corresponding to each element in the first matrix and the second matrix, respectively.

11. The apparatus of claim 10, wherein the conversion module comprises:

a first determining unit configured to determine a first element with a largest absolute value in the first matrix, and record the largest absolute value of the first element as a first largest value Max1;

a first mapping unit configured to map each element of the first matrix to the interval [0, 2ⁿ] based on the first maximum value Max1, to convert the element into a corresponding conversion element;

a second determining unit configured to determine a second element having a largest absolute value in the second matrix, and record a largest absolute value of the second element as a second largest value Max2;

a second mapping unit configured to map each element of the second matrix to the interval [0, 2ⁿ] based on the second maximum value Max2, to convert the element into a corresponding conversion element.

12. The apparatus of claim 11, wherein the first mapping unit comprises:

a first determination subunit configured to determine, based on the first maximum value, a first division point to divide an interval [0, max1] into two subintervals; and

a second determining subunit configured to determine, for each element in the first matrix, the exponent bit of the conversion element corresponding to the element based on the subinterval in which the element is located, and map the element to [0, 2ⁿ] to determine the mantissa bits of the conversion element corresponding to that element.

13. The apparatus of claim 12, wherein the first determining subunit is further configured to:

determine the first division point as Max1/2ⁿ, i.e., 1/2ⁿ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁿ and one from Max1/2ⁿ to Max1.

14. The apparatus of claim 13, wherein the second determining subunit comprises:

a third determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element;

a fourth determining subunit configured to respond to the element being located in the subinterval

Or

A fifth determining subunit configured to respond to the element being located in a subinterval

15. The apparatus of claim 12, wherein the first determining subunit is further configured to:

determine the first division point as Max1/2ⁱ, i.e., 1/2ⁱ of the first maximum value Max1, to divide the interval [0, Max1] into two subintervals, one from 0 to Max1/2ⁱ and one from Max1/2ⁱ to Max1, wherein i is a positive integer less than n.

16. The apparatus of claim 15, wherein the second determining subunit comprises:

a sixth determining subunit configured to determine, for each element in the first matrix, an absolute value a of the element;

a seventh determining subunit configured to determine that the absolute value a is located in the subinterval in response to

In the method, the exponent bit of the conversion element corresponding to the element is determined to be 0, and the element is mapped to [0,2 ⁿ ]to determine the mantissa bits of the conversion element corresponding to the element as

Or

An eighth determining subunit configured to determine that the absolute value a is within the subinterval in response to the absolute value a being within the subinterval

17. The apparatus of any of claims 10-16, further comprising:

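a determining module configured to determine, for each element in the first matrix and each element in the second matrix, a recovery factor corresponding to the element, the recovery factor satisfying the following condition: the conversion element corresponding to the element multiplied by the recovery factor corresponding to the element is equal to the absolute value of the element.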

18. The apparatus according to claim 17, wherein the absolute value of each element in the third matrix is equal to the mantissa bit of the conversion element of the third element corresponding to the element in the first matrix multiplied by the mantissa bit of the conversion element of the fourth element corresponding to the element in the second matrix multiplied by the recovery factor corresponding to the third element multiplied by the recovery factor corresponding to the fourth element, and the sign bit of the element in the third matrix is an exclusive or value of the sign bit of the third element and the sign bit of the fourth element.

19. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.

20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-9.

21. A computer program product comprising a computer program, wherein the computer program realizes the method of any one of claims 1-9 when executed by a processor.

22. An electronic circuit, comprising:

circuitry configured to perform the method of any of claims 1-9.