CN111651486B - Processing device and method

Info

Publication number: CN111651486B
Application number: CN202010446944.7A
Authority: CN
Inventors: Name withheld on request
Original assignee: Shanghai Cambricon Information Technology Co Ltd
Current assignee: Shanghai Cambricon Information Technology Co Ltd
Priority date: 2020-05-25
Filing date: 2020-05-25
Publication date: 2023-06-27
Anticipated expiration: 2040-05-25
Also published as: CN111651486A

Abstract

The application discloses a processing device and a processing method. The device comprises a data transmission unit and a floating point lookup unit, and the floating point lookup unit comprises at least one floating point lookup subunit. The floating point lookup unit is used for receiving input data, which is floating point data; each floating point lookup subunit performs a lookup table operation in parallel on its corresponding input data to obtain a floating point lookup sub-result and returns that sub-result to the data transmission unit. The data transmission unit is used for receiving the input data, broadcasting it to the floating point lookup unit, receiving the floating point lookup sub-results, and ordering them according to the input data to obtain a floating point lookup result. By configuring at least one floating point lookup subunit to process the lookup table operation in parallel, the application increases the speed of the lookup table operation.

Description

Processing device and method

Technical Field

The present disclosure relates to the field of data processing technologies, and in particular, to a processing apparatus and a method.

Background

In artificial intelligence applications, the lookup table is a common processing technique. In image processing, for example, mapping the colors of an image changes its color distribution, which can be used to enhance contrast, adjust color difference, and so on. A conventional lookup table establishes a mapping between an index number and an output value, so that the output value can be obtained quickly from the input value. When the input value is floating point data, however, the conventional technique cannot search on it directly: the floating point data must first be converted to integer data, the integer is looked up, and the result is converted back to floating point data for output. The lookup cycle is therefore long and the process is cumbersome. Moreover, when there are multiple inputs, the input values must be traversed one by one, so the overall lookup is slow and inefficient.

Disclosure of Invention

The embodiment of the application provides a processing device and a processing method, which can be used for processing the lookup table operation in parallel by configuring at least one floating point lookup subunit, so that the operation speed of the lookup table operation is effectively improved.

In a first aspect, an embodiment of the present application provides a processing apparatus, where the processing apparatus includes a data transmission unit and a floating point lookup unit, wherein the floating point lookup unit comprises at least one floating point lookup subunit;

the floating point searching unit is used for receiving input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, and the input data is floating point data;

the data transmission unit is used for receiving the input data, broadcasting the input data to the floating point searching unit, receiving the floating point searching sub-results, and sequencing the floating point searching sub-results according to the input data to obtain floating point searching results.

Optionally, the floating point lookup subunit includes: a configuration module, a lookup module, and an operation module, wherein:

The configuration module is used for setting configuration information corresponding to each floating point searching subunit in the floating point searching unit, and the configuration information comprises a configuration table and a searching table;

the searching module is used for determining the data range of the input data currently searched by each floating point searching subunit in the floating point searching unit according to the configuration information, determining the data to be processed from the input data according to the data range, and sending the data to be processed to the corresponding operation module;

the operation module is used for processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.

Optionally, the configuration table includes: a first parameter, a second parameter, a third parameter, and a fourth parameter;

the first parameter is used to determine the size of a lookup table segment, the second parameter is used to determine the index range of the lookup table, the third parameter is used to determine the lookup table segment, and the fourth parameter is used to determine the number of segments of the lookup table segment.

Optionally, the search module is specifically configured to:

determining the number of segments of the search segments of the floating point search unit for executing the lookup table operation and the data range of each segment for executing the lookup table operation according to the first parameter and the second parameter;

Determining a data range according to the third parameter, the fourth parameter, the number of segments and the data range corresponding to each segment for executing the lookup table operation;

and determining the data to be processed of the floating point searching subunit according to the data range.

Optionally, the data transmission unit is further configured to:

and determining whether all data in the input data are subjected to lookup table operation according to the floating point lookup result to obtain a verification result.

Optionally, the configuration module is further configured to:

and resetting configuration information corresponding to each floating point searching subunit in the floating point searching unit when the verification result is that not all data in the input data has been subjected to the lookup table operation.

Optionally, the processing device further includes a storage unit; the storage unit is used for storing the input data and the floating point search result.

In a second aspect, an embodiment of the present application provides a processing method applied to a processing apparatus, where the processing apparatus includes a data transmission unit and a floating point lookup unit, and the floating point lookup unit includes at least one floating point lookup subunit, and the method includes:

The data transmission unit receives input data and broadcasts the input data to the floating point searching unit;

the floating point searching unit receives the input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, and the input data is floating point data;

and the data transmission unit receives the floating point searching sub-result, and sorts the floating point searching sub-result according to the input data to obtain a floating point searching result.

Optionally, the method further comprises:

setting configuration information corresponding to each floating point searching subunit in the floating point searching unit, wherein the configuration information comprises a configuration table and a searching table;

determining a data range of input data currently searched by each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to a corresponding operation module;

and processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.

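Optionally, the determining the data to be processed from the input data according to the data range includes:

determining the number of segments of the search segments of the floating point searching unit for executing the lookup table operation and the data range of each segment for executing the lookup table operation according to the first parameter and the second parameter of the configuration table;

determining a data range according to the third parameter, the fourth parameter, the number of segments and the data range corresponding to each segment for executing the lookup table operation;

and determining the data to be processed of the floating point searching subunit according to the data range.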

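Optionally, the method further comprises: determining whether all data in the input data has been subjected to the lookup table operation according to the floating point searching result, to obtain a verification result.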

Optionally, the method further comprises: resetting configuration information corresponding to each floating point searching subunit in the floating point searching unit when the verification result is that not all data in the input data has been subjected to the lookup table operation.

Optionally, the method further comprises: and storing the input data and the floating point search result.

In a third aspect, embodiments of the present application provide a computer device comprising a processor, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be processed by the processor, the program comprising instructions for performing the method of any of the second aspects.

In a fourth aspect, embodiments of the present application provide a computer readable storage medium comprising a computer program stored for data exchange, which when executed by a processor, implements some or all of the steps as described in the second aspect of embodiments of the present application.

In a fifth aspect, embodiments of the present application provide a computer program product, wherein the computer program product comprises a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps described in the second aspect of embodiments of the present application. The computer program product may be a software installation package.

It can be seen that the embodiments of the present application provide a data transmission unit and a floating point lookup unit that includes at least one floating point lookup subunit. The floating point lookup unit receives input data, which is floating point data; each floating point lookup subunit performs a lookup table operation in parallel on its corresponding input data to obtain a floating point lookup sub-result and returns it to the data transmission unit. The data transmission unit receives the input data, broadcasts it to the floating point lookup unit, receives the floating point lookup sub-results, and orders them according to the input data to obtain a floating point lookup result. Configuring at least one floating point lookup subunit to process the lookup table operation in parallel increases the speed of the operation, and performing the lookup directly on floating point data reduces the overhead of data type conversion.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic structural diagram of a computer device according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a processing device according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of another processing device according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a floating point lookup subunit according to an embodiment of the present application;

Fig. 5 is a schematic diagram of the mapping relation of a lookup table according to an embodiment of the present application;

Fig. 6 is a schematic structural diagram of another processing device according to an embodiment of the present application;

Fig. 7 is a schematic flow chart of a processing method according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will appreciate, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

The following describes the technical solution of the present application and how the technical solution of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.

Referring to fig. 1, fig. 1 is a schematic structural diagram of a computer device provided in an embodiment of the present application, and as shown in fig. 1, the computer device may include a processor, a memory, and one or more programs stored in the memory and configured to be processed by the processor. The computer device may further include a communication bus, an input device, and an output device, where the processor, the memory, the input device, and the output device may be interconnected by the bus.

The processor is configured to implement the following steps when executing the program stored in the memory:

the floating point searching unit receives the input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, and the input data is floating point data.

Further, the processor may be a central processing unit (Central Processing Unit, CPU), an intelligent processor (NPU), a graphics processor (Graphics Processing Unit, GPU), or an image processor (Image Processing Unit), which is not limited in this application. Depending on the processor, the processing method provided by the embodiments of the application can be applied in artificial intelligence fields such as image recognition, deep learning, computer vision, intelligent robotics, and natural language processing, to execute complex functional programs in those fields. For example, in image recognition, the image processor can increase the contrast and brightness of an image through lookup table operations; in computer vision, the processor can adjust the color difference of images on a display screen through lookup table operations.

Referring to fig. 2, fig. 2 is a schematic structural diagram of a processing apparatus 200 according to an embodiment of the present application, where the apparatus 200 is applied to the computer device shown in fig. 1. As shown in fig. 2, the apparatus 200 includes a data transmission unit 21 and a floating point lookup unit 22, and the floating point lookup unit 22 includes at least one floating point lookup subunit 220;

the floating point searching unit 22 is configured to receive input data, each floating point searching subunit 220 in the floating point searching unit 22 performs a lookup table operation in parallel according to the corresponding input data, so as to obtain a corresponding floating point searching sub-result, and return the floating point searching sub-result to the data transmission unit 21, where the input data is floating point type data;

the data transmission unit 21 is configured to receive the input data, broadcast the input data to the floating point lookup unit 22, and receive the floating point lookup sub-result, and sort the floating point lookup sub-result according to the input data to obtain a floating point lookup result.

Specifically, the floating point lookup unit 22 includes one or more floating point lookup subunits 220. As shown in fig. 2, these may be floating point lookup subunit 1, floating point lookup subunit 2, ..., floating point lookup subunit N, where N is a positive integer, and each floating point lookup subunit 220 implements the lookup table function for a portion of the data. When the data transmission unit 21 receives the input data, it broadcasts the input data to each floating point lookup subunit 220 of the floating point lookup unit 22 to perform the lookup table operation. Each floating point lookup subunit 220 receives the input data, performs the lookup table operation only on the data that falls within its own data range, and returns its floating point lookup sub-result to the data transmission unit 21. The data transmission unit 21 gathers the sub-results according to the storage positions of the input data and writes each floating point lookup sub-result back to the storage position of the corresponding input data.

For example, as shown in fig. 3, the input data is abcdefgh, and the floating point lookup unit 22 includes floating point lookup subunit 1, floating point lookup subunit 2, and floating point lookup subunit 3. After receiving abcdefgh, the data transmission unit 21 broadcasts it to all three subunits. Floating point lookup subunit 1 performs the lookup table operation only on the data a and c, obtaining the floating point lookup sub-result a'c'; floating point lookup subunit 2 performs the lookup table operation only on the data d, e, and g, obtaining the floating point lookup sub-result d'e'g'; floating point lookup subunit 3 performs the lookup table operation only on the data b and f, obtaining the floating point lookup sub-result b'f'. The three subunits then each send their floating point lookup sub-results to the data transmission unit 21. After receiving the sub-results, the data transmission unit 21 gathers and orders them according to the storage positions of the input data, obtaining the floating point lookup result a'b'c'd'e'f'g'.
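The broadcast-and-gather flow of this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `make_subunit` and `lookup`, the use of a thread pool to stand in for hardware parallelism, the letter-to-primed-letter table, and `None` as the marker for a position that no subunit handled are all hypothetical and do not come from the application.

```python
from concurrent.futures import ThreadPoolExecutor


def make_subunit(owned, table):
    """Build one lookup subunit (hypothetical model).

    `owned` stands in for the data range set by the subunit's configuration
    table, and `table` stands in for its lookup table.
    """
    def subunit(broadcast):
        # Every subunit sees the whole broadcast but handles only its share,
        # remembering the position of each input it processed.
        return {pos: table[x] for pos, x in enumerate(broadcast) if x in owned}
    return subunit


def lookup(inputs, subunits):
    """Broadcast `inputs`, run all subunits in parallel, gather by position."""
    result = [None] * len(inputs)  # None marks a position nobody handled
    with ThreadPoolExecutor(max_workers=len(subunits)) as pool:
        for sub_result in pool.map(lambda s: s(inputs), subunits):
            for pos, value in sub_result.items():
                result[pos] = value  # write back to the input's position
    return result


inputs = list("abcdefgh")
table = {ch: ch + "'" for ch in inputs}
subunits = [
    make_subunit(set("ac"), table),   # floating point lookup subunit 1
    make_subunit(set("deg"), table),  # floating point lookup subunit 2
    make_subunit(set("bf"), table),   # floating point lookup subunit 3
]
print(lookup(inputs, subunits))
```

In this sketch the last position stays `None`, because none of the three subunits covers h.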

Further, the number of floating point lookup subunits 220 included in the floating point lookup unit 22 may be adjusted according to hardware design requirements, such as hardware area or performance requirements. Each floating point lookup subunit 220 is responsible for lookup table operations on data within a certain range. When the data range of the input data exceeds the combined coverage of all floating point lookup subunits 220, the floating point lookup subunits 220 can be reused by configuring the configuration information of each floating point lookup subunit 220 in the floating point lookup unit 22 multiple times, thereby realizing the lookup table function.

In the embodiment of the present application, the at least one floating point lookup subunit 220 is configured to process the lookup table operation in parallel, so that each floating point lookup subunit 220 can process part of the data in the input data in parallel, thereby effectively shortening the operation period of the lookup table of the input data, and improving the operation speed of the lookup table operation.

Optionally, fig. 4 is a schematic structural diagram of a floating point lookup subunit 220 provided in an embodiment of the present application. As shown in fig. 4, the floating point lookup subunit 220 includes: a configuration module 221, a lookup module 222, and an operation module 223, wherein:

The configuration module 221 is configured to set configuration information corresponding to each floating point lookup subunit 220 in the floating point lookup unit 22, where the configuration information includes a configuration table and a lookup table;

the searching module 222 is configured to determine, according to the configuration information, a data range of the input data currently searched by each floating point searching subunit 220 in the floating point searching unit 22, determine, according to the data range of the input data currently searched by each floating point searching subunit 220 in the floating point searching unit 22, data to be processed from the input data, and send the data to be processed to the corresponding operation module 223;

the operation module 223 is configured to process the data to be processed according to the mapping relationship of the lookup table, and obtain a mapping value corresponding to the data to be processed, where the mapping value is the floating point lookup sub-result.

In the embodiment of the present application, the configuration module 221 configures a corresponding configuration table and lookup table for each floating point lookup subunit 220 according to the functional requirements. After each floating point lookup subunit 220 in the floating point lookup unit 22 receives the input data broadcast by the data transmission unit 21, a lookup table operation is performed on the input data within the processing range of the lookup table of the current floating point lookup subunit 220 according to the received input data, the configuration table and the lookup table.

Wherein the configuration table comprises: a first parameter, a second parameter, a third parameter, and a fourth parameter. The first parameter is used for determining the size of the lookup table segment, the second parameter is used for determining the index range of the lookup table, the third parameter is used for determining the lookup table segment, and the fourth parameter is used for determining the segment number of the lookup table segment.

Specifically, the first parameter may be denoted by base, which determines the size of each lookup table segment: each lookup table segment has a size of 2^base. The second parameter may be denoted by base_index, which determines the index range in which the lookup table is located. The data range in which the current floating point lookup subunit 220 performs the lookup table operation is therefore determined by base and base_index, and may be expressed as [±2^(base+base_index-1), ±2^(base+base_index)]. The third parameter may be denoted by index, which determines a specific lookup table segment; the sign of index indicates whether that segment lies on the positive half-axis or the negative half-axis: a positive index indicates the positive half-axis, and a negative index indicates the negative half-axis. The fourth parameter may be denoted by k, which determines the number of segments within each lookup table segment. The lookup table provides the mapping relation of the lookup table: each segment within a lookup table segment corresponds to one mapped value, so each lookup table segment may correspond to k values.

For example, if base is 2, base_index is 2, index is 1, and k is 2, then base gives a lookup table segment size of 2^2, i.e., 4. From base and base_index, the index range of the lookup table is [±2^3, ±2^4], i.e., the data range of the lookup table is [±8, ±16]. Because each lookup table segment in this data range has a size of 4, the lookup table contains 4 lookup table segments: [-16, -12], [-12, -8], [8, 12], and [12, 16]. The index selects the specific lookup table segment [8, 12], and k determines that this lookup table segment contains 2 segments, [8, 10] and [10, 12].
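The arithmetic of this worked example can be reproduced with a short Python sketch. The class and function names are hypothetical, and one point is an assumption rather than something the text states: index is treated as a signed, 1-based selector that counts lookup table segments outward from the smaller magnitude, which agrees with index 1 selecting [8, 12] but is not confirmed for other values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigTable:
    base: int        # first parameter: each lookup table segment spans 2**base
    base_index: int  # second parameter: selects the index range
    index: int       # third parameter: signed, 1-based segment selector (assumed)
    k: int           # fourth parameter: number of segments inside one table segment


def data_range(cfg):
    """Magnitude bounds (2**(base+base_index-1), 2**(base+base_index))."""
    hi = 2.0 ** (cfg.base + cfg.base_index)
    return hi / 2, hi


def table_segments(cfg):
    """All lookup table segments of the data range, negative half-axis first."""
    lo, hi = data_range(cfg)
    size = 2.0 ** cfg.base
    count = int((hi - lo) / size)  # segments per half-axis
    positive = [(lo + i * size, lo + (i + 1) * size) for i in range(count)]
    negative = [(-b, -a) for a, b in reversed(positive)]
    return negative + positive


def selected_segment(cfg):
    """The lookup table segment picked by `index` (sign picks the half-axis)."""
    lo, _ = data_range(cfg)
    size = 2.0 ** cfg.base
    start = lo + (abs(cfg.index) - 1) * size
    seg = (start, start + size)
    return seg if cfg.index > 0 else (-seg[1], -seg[0])


def sub_segments(cfg):
    """Split the selected lookup table segment into k equal segments."""
    a, b = selected_segment(cfg)
    step = (b - a) / cfg.k
    return [(a + i * step, a + (i + 1) * step) for i in range(cfg.k)]


cfg = ConfigTable(base=2, base_index=2, index=1, k=2)
print(data_range(cfg), table_segments(cfg), selected_segment(cfg), sub_segments(cfg))
```

With base 2, base_index 2, index 1, and k 2, the sketch yields the data range 8 to 16, the four lookup table segments listed in the example, the selected segment [8, 12], and the two segments [8, 10] and [10, 12].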

In the embodiment of the present application, by configuring different configuration tables and lookup tables for each floating point lookup subunit 220, each floating point lookup subunit 220 can implement a lookup table function of different part of data, and by configuring multiple floating point lookup subunits 220, configuration can be flexibly selected on the basis of performance and hardware area, so as to simplify the design of hardware.

Optionally, the searching module 222 is specifically configured to:

determining, according to the first parameter and the second parameter, the number of lookup segments on which the floating point lookup unit 22 performs the lookup table operation and the data range corresponding to each segment;

determining a data range according to the third parameter, the fourth parameter, the number of segments, and the data range corresponding to each segment;

and determining the data to be processed of the floating point lookup subunit 220 according to the data range.

Specifically, each floating point lookup subunit 220 includes a configuration module 221, and each configuration module 221 includes a configuration table. The first parameter and the second parameter in the configuration table determine the number of lookup segments on which each floating point lookup subunit 220 performs the lookup table operation and the data range corresponding to each segment. The lookup module 222 can therefore determine the lookup data from the input data according to the data range of each segment, where the lookup data is the data that falls within the data range in which the floating point lookup subunit 220 performs the lookup table operation. The third parameter, the fourth parameter, the number of segments, and the data range corresponding to each segment then determine the data range of the lookup data within the corresponding lookup table segment, and the data to be processed of the floating point lookup subunit 220 is determined according to that data range.

In the embodiment of the application, the lookup table provides the mapping relation of the lookup table within the data range: each segment within a lookup table segment provides one mapped value, and a lookup table segment provides 2^k values in total. The lookup module 222 determines, from the input data and according to the configuration table in the configuration information, the data to be processed that falls within the data range of the floating point lookup subunit 220, and then transmits the data to be processed to the operation module 223. The operation module 223 obtains the mapping value corresponding to the data to be processed according to the mapping relation of the lookup table provided by the configuration module 221, thereby obtaining the floating point lookup sub-result of the floating point lookup subunit 220, and writes the floating point lookup sub-result back to the data transmission unit 21 through the data path of the lookup module 222, where the mapping value is the floating point lookup sub-result.

Fig. 5 is a schematic diagram of the mapping relation of a lookup table according to an embodiment of the present application. The operation module 223 determines which data range the current data falls in according to the configuration information provided by the lookup module 222, and maps the data to be processed onto the lookup table data according to the mapping relation of the lookup table to obtain the mapping value. Specifically, the lookup module 222 compares the exponent bits of the input data with the first parameter and selects the input data for which the difference equals the second parameter. It then compares the high-order significant bits of the selected input data with the third parameter and takes the data that matches the third parameter as the data to be processed. The lookup module 222 transmits the data to be processed to the operation module 223, which determines the segment within the lookup table segment according to the significant bits of the data to be processed and performs the lookup table operation to obtain the mapping value corresponding to the data to be processed.
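A minimal Python sketch of this selection and mapping is given below. It replaces the bit-level comparisons with equivalent arithmetic and rests on several assumptions that the text does not confirm: the exponent is taken in the convention of `math.frexp` (value = m × 2^exp with 0.5 ≤ |m| < 1), which makes exp − base equal base_index for the worked example above; index is a signed, 1-based segment selector; following that example, k is the number of segments inside the selected lookup table segment; and `table` is a hypothetical list of k mapped values. The function name `select_and_map` is likewise illustrative.

```python
import math


def select_and_map(inputs, base, base_index, index, k, table):
    """Return {position: mapped value} for the inputs this subunit owns."""
    size = 2.0 ** base                      # size of one lookup table segment
    lo = 2.0 ** (base + base_index - 1)     # lower magnitude bound of the range
    out = {}
    for pos, x in enumerate(inputs):
        if x == 0.0 or not math.isfinite(x):
            continue                        # zero, inf and nan are never owned
        _, exp = math.frexp(x)              # x = m * 2**exp, 0.5 <= |m| < 1
        if exp - base != base_index:
            continue                        # exponent test: outside the data range
        offset = abs(x) - lo                # distance into the data range
        seg = int(offset // size) + 1       # 1-based lookup table segment
        if seg != abs(index) or (x > 0) != (index > 0):
            continue                        # not the segment selected by index
        sub = int((offset % size) * k // size)  # segment inside the table segment
        out[pos] = table[sub]
    return out


inputs = [9.0, 3.0, 11.0, 13.0, -9.0, 8.0, 16.0]
print(select_and_map(inputs, base=2, base_index=2, index=1, k=2, table=[0.25, 0.75]))
```

Here only 9.0, 11.0, and 8.0 fall in the selected segment from 8 up to (but not including) 12. The half-open upper bound is a consequence of the exponent test in this sketch, not a statement from the application.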

In the embodiment of the application, the floating point data is directly subjected to the lookup table operation, so that the process of converting the floating point data into integer data and then converting the integer data into the floating point data after the lookup table function is executed is avoided, and the cost caused by data type conversion is reduced.

Optionally, the data transmission unit 21 is further configured to:

based on the floating point lookup result, it is determined whether all of the input data is to be subjected to a lookup table operation, resulting in a validation result, and the validation result is sent to each floating point lookup subunit 220 in the floating point lookup unit 22.

Specifically, after receiving the floating point lookup sub-result sent by each floating point lookup sub-unit 220, the data transmission unit 21 matches the content of the floating point lookup sub-result with the input data, determines the position of the input data corresponding to the content of the floating point lookup sub-result, and stores the content of the floating point lookup sub-result to the position of the input data, thereby obtaining the floating point lookup result. And determining whether all data in the input data are subjected to the lookup table operation according to whether the vacant positions exist in the floating point lookup result, and obtaining a verification result. If the floating point searching result has a vacant position, verifying that all data in the input data are not subjected to the operation of the searching table; if the floating point search result does not have a vacant position, verifying that all data in the input data of the result is executed with the operation of the lookup table.

Further, when the verification result is that not all data in the input data has been subjected to the lookup table operation, the data transmission unit 21 may broadcast the input data to each floating point lookup subunit 220 again to perform the lookup table operation; alternatively, the data transmission unit 21 may broadcast only the input data corresponding to the vacant positions in the floating point lookup result, i.e. the data that has not been subjected to the lookup table operation, to each floating point lookup subunit 220 to perform the lookup table operation.

Optionally, the configuration module 221 is further configured to:

reset the configuration information corresponding to each floating point lookup subunit 220 in the floating point lookup unit 22 when the verification result is that not all data in the input data has been subjected to the lookup table operation.

Where the data range of the input data exceeds the coverage of all the floating point lookup subunits 220, or where there are few floating point lookup subunits 220 (e.g., only one floating point lookup subunit 220), performing the lookup table operation on the input data once may not cover the data range of all the input data. For example, if the input data abcdefgh exceeds the data coverage of floating point lookup subunit 1, floating point lookup subunit 2 and floating point lookup subunit 3, the data h in the input data is not subjected to the lookup table operation. When the verification result indicates that not all data in the input data has been subjected to the lookup table operation, the configuration module 221 may update the configuration information of each floating point lookup subunit 220, and the lookup table operation is performed on the same data again.
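The reconfigure-and-retry behaviour described above can be sketched as follows. The sketch is illustrative only: it assumes each subunit's configuration can be reduced to a half-open numeric range `[low, high)` plus a mapping function, and that the successive configurations are supplied in advance as `config_rounds`; these names and this representation are hypothetical.

```python
def lookup_round(data, pending, configs):
    """One broadcast round: each configured range handles the pending items it covers."""
    found = {}
    for low, high, table_fn in configs:
        for position in pending:
            if low <= data[position] < high:
                found[position] = table_fn(data[position])
    return found

def lookup_with_reconfiguration(data, config_rounds):
    """Repeat the lookup with updated configurations until no position is vacant."""
    result = [None] * len(data)        # None marks a vacant position
    pending = set(range(len(data)))    # positions not yet looked up
    for configs in config_rounds:      # each entry models one (re)configuration
        found = lookup_round(data, pending, configs)
        for position, value in found.items():
            result[position] = value
        pending.difference_update(found)
        if not pending:                # verification passed, stop early
            break
    return result, not pending
```

In this sketch only the positions still vacant are looked up in later rounds, which corresponds to broadcasting only the data that has not yet been subjected to the lookup table operation.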

Optionally, as shown in fig. 6, the apparatus further includes a storage unit 23, where the storage unit 23 is configured to store the input data and the floating point lookup result.

Specifically, the input data may be stored in the storage unit 23 of the present apparatus, and the data transmission unit 21 may acquire the input data from the storage unit 23; for example, the user may store the input data in the memory of the computer shown in fig. 1. The data transmission unit 21 may also store the obtained floating point lookup result in the storage unit 23.

It can be seen that the processing apparatus of the embodiment of the present application includes the data transmission unit 21 and the floating point lookup unit 22, and the floating point lookup unit 22 includes at least one floating point lookup subunit 220. The floating point lookup unit 22 is configured to receive input data; each floating point lookup subunit 220 in the floating point lookup unit 22 performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point lookup sub-result, and returns the floating point lookup sub-result to the data transmission unit 21, where the input data is floating point data. The data transmission unit 21 is configured to receive the input data, broadcast the input data to the floating point lookup unit 22, receive the floating point lookup sub-results, and sort the floating point lookup sub-results according to the input data to obtain a floating point lookup result. Configuring the at least one floating point lookup subunit 220 allows the lookup table operation to be processed in parallel, which effectively improves the operation speed of the lookup table operation; and because the lookup table operation is performed directly on floating point data, the cost caused by data type conversion is reduced.

For example, when the processing device provided by the present application performs image recognition processing, the data transmission unit 21 receives image recognition input data and broadcasts it to the floating point lookup unit 22, where the image recognition input data is floating point data. The floating point lookup unit 22 receives the image recognition input data, and each floating point lookup subunit 220 in the floating point lookup unit 22 performs a lookup table operation in parallel according to the corresponding image recognition input data to obtain a corresponding floating point lookup sub-result, which is returned to the data transmission unit 21. The data transmission unit 21 receives the floating point lookup sub-results and sorts them according to the image recognition input data to obtain the floating point lookup result. Configuring at least one floating point lookup subunit 220 allows the lookup table operation to be processed in parallel, which effectively improves the operation speed of the lookup table operation and the operation efficiency of the image recognition system; and because the lookup table operation is performed directly on floating point data, the cost caused by data type conversion is reduced.

Further, when the processing device provided by the present application performs deep learning, the data transmission unit 21 receives deep learning input data and broadcasts it to the floating point lookup unit 22, where the deep learning input data is floating point data. The floating point lookup unit 22 receives the deep learning input data, and each floating point lookup subunit 220 in the floating point lookup unit 22 performs a lookup table operation in parallel according to the corresponding deep learning input data to obtain a corresponding floating point lookup sub-result, which is returned to the data transmission unit 21. The data transmission unit 21 receives the floating point lookup sub-results and sorts them according to the deep learning input data to obtain the floating point lookup result. Configuring at least one floating point lookup subunit 220 allows the lookup table operation to be processed in parallel, which effectively improves the operation speed of the lookup table operation and the operation efficiency of the deep learning system; and because the lookup table operation is performed directly on floating point data, the cost caused by data type conversion is reduced.

Referring to fig. 7, fig. 7 is a flow chart of a processing method according to an embodiment of the present application, which is applied to the processing apparatus shown in fig. 2, where the processing apparatus includes a data transmission unit and a floating point searching unit, and the floating point searching unit includes at least one floating point searching subunit. As shown in fig. 7, the method includes the steps of:

S710, the data transmission unit receives input data and broadcasts the input data to the floating point searching unit;

S720, the floating point searching unit receives the input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, and the input data is floating point data;

and S730, the data transmission unit receives the floating point searching sub-result, and sorts the floating point searching sub-result according to the input data to obtain a floating point searching result.
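Steps S710 to S730 can be sketched in software as follows. This is a hedged sketch rather than the hardware of the embodiment: it models each floating point searching subunit as a task in a thread pool, assumes each subunit covers a half-open range `[low, high)` with its own mapping function, and assumes sub-results carry the input position so that they can be sorted; all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def subunit_lookup(input_data, low, high, table_fn):
    """S720: one subunit maps only the broadcast data that lies in its own range."""
    return [(position, table_fn(x))
            for position, x in enumerate(input_data)
            if low <= x < high]

def process(input_data, subunit_configs):
    """S710: broadcast the input; S720: parallel lookups; S730: sort into input order."""
    with ThreadPoolExecutor(max_workers=len(subunit_configs)) as pool:
        futures = [pool.submit(subunit_lookup, input_data, *config)
                   for config in subunit_configs]
        sub_results = [future.result() for future in futures]
    merged = sorted((pair for sub in sub_results for pair in sub),
                    key=lambda pair: pair[0])
    return [value for _, value in merged]
```

Every subunit receives the complete input and filters it locally, which mirrors the broadcast in S710; the final sort by position restores the order of the input data regardless of which subunit finishes first.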

Optionally, the method further comprises:

It may be understood that the processing method in the embodiment of the present application may be implemented in the same manner as the embodiments of the processing apparatus; for the specific implementation process, reference may be made to the related description of the apparatus embodiments, which is not repeated herein.

The present application also provides a computer storage medium storing a computer program for electronic data exchange, the computer program causing a computer to execute some or all of the steps of any one of the methods described in the method embodiments above.

Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the methods described in the method embodiments above. The computer program product may be a software installation package.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.

In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of units described above is merely a division of logical functions, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between components may be an indirect coupling or communication connection via some interfaces, devices or units, and may be in electrical or other forms.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a terminal device, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned memory includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing program code.

Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, etc.

The embodiments of the present application have been described in detail above, and specific examples have been used herein to illustrate the principles and implementations of the present application; the description of the above embodiments is intended only to assist in understanding the method of the present application and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims

1. A processing device, comprising a data transmission unit, a floating point lookup unit, wherein: the floating point lookup unit comprises at least one floating point lookup subunit;

the floating point searching unit is used for receiving input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, the input data is floating point type data, and each floating point searching sub-unit performs the lookup table operation on data in the data range of the current floating point searching sub-unit in the input data;

2. The apparatus of claim 1, wherein the floating point lookup subunit comprises: a configuration module, a lookup module and an operation module, wherein:

3. The apparatus of claim 2, wherein the configuration table comprises: a first parameter, a second parameter, a third parameter, and a fourth parameter;

4. The apparatus of claim 3, wherein the lookup module is specifically configured to:

5. The apparatus of claim 2, wherein the data transmission unit is further configured to:

6. The apparatus of claim 5, wherein the configuration module is further configured to:

7. The apparatus of any one of claims 1-6, wherein the processing apparatus further comprises a storage unit; the storage unit is used for storing the input data and the floating point search result.

8. A processing method for use in a processing device, the processing device comprising a data transmission unit and a floating point lookup unit, the floating point lookup unit comprising at least one floating point lookup subunit, the method comprising:

the floating point searching unit receives the input data, each floating point searching subunit in the floating point searching unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point searching sub-result, the floating point searching sub-result is returned to the data transmission unit, the input data is floating point type data, and each floating point searching sub-unit performs the lookup table operation on data in the data range of the current floating point searching sub-unit in the input data;

9. The method of claim 8, wherein the method further comprises:

10. The method of claim 9, wherein the configuration table comprises: a first parameter, a second parameter, a third parameter, and a fourth parameter;

11. The method of claim 10, wherein said determining data to be processed from said input data in accordance with said data range comprises:

12. The method according to claim 9, wherein the method further comprises:

13. The method according to claim 12, wherein the method further comprises:

14. The method according to any one of claims 8-13, further comprising:

and storing the input data and the floating point search result.

15. A computer device comprising a processor, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 8-14.

16. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program for electronic data exchange, which computer program, when executed by a processor, implements the method according to any of claims 8-14.