CN111651486A - Processing apparatus and method - Google Patents

Info

Publication number
CN111651486A
Authority
CN
China
Prior art keywords
floating point
data
lookup
point search
input data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010446944.7A
Other languages
Chinese (zh)
Other versions
CN111651486B (en
Inventor
Inventor name not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to CN202010446944.7A
Publication of CN111651486A
Application granted
Publication of CN111651486B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a processing device and method. The device comprises a data transmission unit and a floating point lookup unit, the floating point lookup unit comprising at least one floating point lookup subunit. The floating point lookup unit is used for receiving input data; each floating point lookup subunit in the floating point lookup unit performs a lookup table operation in parallel on its corresponding input data to obtain a corresponding floating point lookup sub-result, and returns the floating point lookup sub-result to the data transmission unit, wherein the input data are floating point type data. The data transmission unit is used for receiving the input data, broadcasting the input data to the floating point lookup unit, receiving the floating point lookup sub-results, and sorting the floating point lookup sub-results according to the input data to obtain the floating point lookup result. By configuring at least one floating point lookup subunit, the lookup table operation is processed in parallel, which effectively improves the speed of the lookup table operation.

Description

Processing apparatus and method
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a processing apparatus and method.
Background
In artificial intelligence applications, the lookup table is a common processing technique. In image processing, for example, functions such as enhancing image contrast and adjusting color difference can be realized by mapping the colors of an image and thereby changing its color distribution. Current lookup table techniques establish a mapping relation between an index number and an output value, so that the output value can be obtained quickly from the input value and the mapping relation. However, when the input value of a lookup table is floating point data, existing lookup table techniques cannot search on the floating point data directly: the floating point data must first be converted into integer data, the lookup is performed to obtain a result, and the result is then converted back into floating point data for output.
Disclosure of Invention
The embodiment of the application provides a processing device and a processing method, which can be used for processing lookup table operation in parallel by configuring at least one floating point lookup subunit, so that the operation speed of the lookup table operation is effectively improved.
In a first aspect, an embodiment of the present application provides a processing apparatus, where the processing apparatus includes a data transmission unit and a floating point lookup unit, where: the floating point search unit comprises at least one floating point search subunit;
the floating point search unit is used for receiving input data, each floating point search subunit in the floating point search unit executes a search table operation in parallel according to the corresponding input data to obtain a corresponding floating point search subresult, and the floating point search subresult is returned to the data transmission unit, wherein the input data are floating point type data;
the data transmission unit is used for receiving the input data, broadcasting the input data to the floating point search unit, receiving the floating point search sub-results, and sorting the floating point search sub-results according to the input data to obtain the floating point search results.
Optionally, the floating point lookup subunit includes: the device comprises a configuration module, a search module and an operation module, wherein:
the configuration module is configured to set configuration information corresponding to each floating point lookup subunit in the floating point lookup unit, where the configuration information includes a configuration table and a lookup table;
the searching module is used for determining the data range of currently searched input data of each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to the corresponding operation module;
and the operation module is used for processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.
Optionally, the configuration table includes: a first parameter, a second parameter, a third parameter, and a fourth parameter;
the first parameter is used to determine the size of a lookup table segment, the second parameter is used to determine the exponent range of the lookup table, the third parameter is used to select a specific lookup table segment, and the fourth parameter is used to determine the number of segments within a lookup table segment.
Optionally, the search module is specifically configured to:
according to the first parameter and the second parameter, determining the number of segments of the lookup segment for which the floating point lookup unit executes the lookup table operation and the data range corresponding to each segment for executing the lookup table operation;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit according to the data range.
Optionally, the data transmission unit is further configured to:
and determining whether all data in the input data are subjected to lookup table operation or not according to the floating point lookup result to obtain a verification result.
Optionally, the configuration module is further configured to:
and resetting the configuration information corresponding to each floating point search subunit in the floating point search unit when the verification result indicates that all data in the input data are not subjected to the lookup table operation.
Optionally, the processing apparatus further includes a storage unit; the storage unit is used for storing the input data and the floating point search result.
In a second aspect, an embodiment of the present application provides a processing method, which is applied in a processing apparatus, where the processing apparatus includes a data transmission unit and a floating point lookup unit, where the floating point lookup unit includes at least one floating point lookup subunit, and the method includes:
the data transmission unit receives input data and broadcasts the input data to the floating point searching unit;
the floating point search unit receives the input data, each floating point search subunit in the floating point search unit executes a search table operation in parallel according to the corresponding input data to obtain a corresponding floating point search subresult, and the floating point search subresult is returned to the data transmission unit, wherein the input data are floating point type data;
and the data transmission unit receives the floating point search sub-results, and orders the floating point search sub-results according to the input data to obtain the floating point search results.
Optionally, the method further includes:
setting configuration information corresponding to each floating point searching subunit in the floating point searching unit, wherein the configuration information comprises a configuration table and a searching table;
determining a data range of currently searched input data of each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to a corresponding operation module;
and processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.
Optionally, the configuration table includes: a first parameter, a second parameter, a third parameter, and a fourth parameter;
the first parameter is used to determine the size of a lookup table segment, the second parameter is used to determine the exponent range of the lookup table, the third parameter is used to select a specific lookup table segment, and the fourth parameter is used to determine the number of segments within a lookup table segment.
Optionally, the determining the data to be processed from the input data according to the data range includes:
according to the first parameter and the second parameter, determining the number of segments of the lookup segment for which the floating point lookup unit executes the lookup table operation and the data range corresponding to each segment for executing the lookup table operation;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit according to the data range.
Optionally, the method further includes:
and determining whether all data in the input data are subjected to lookup table operation or not according to the floating point lookup result to obtain a verification result.
Optionally, the method further includes: and resetting the configuration information corresponding to each floating point search subunit in the floating point search unit when the verification result indicates that all data in the input data are not subjected to the lookup table operation.
Optionally, the method further includes: and storing the input data and the floating point search result.
In a third aspect, the present application provides a computer device comprising a processor, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be processed by the processor, and the programs include instructions for executing the method according to any one of the second aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium including a computer program stored thereon for data exchange, the computer program, when executed by a processor, implementing some or all of the steps as described in the second aspect of embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps as described in the second aspect of embodiments of the present application. The computer program product may be a software installation package.
It can be seen that, in the embodiments of the present application, the processing apparatus comprises a data transmission unit and a floating point lookup unit, and the floating point lookup unit includes at least one floating point lookup subunit. The floating point lookup unit is configured to receive input data; each floating point lookup subunit in the floating point lookup unit performs a lookup table operation in parallel on its corresponding input data to obtain a corresponding floating point lookup sub-result, and returns the floating point lookup sub-result to the data transmission unit, where the input data is floating point type data. The data transmission unit is configured to receive the input data, broadcast the input data to the floating point lookup unit, receive the floating point lookup sub-results, and sort the floating point lookup sub-results according to the input data to obtain the floating point lookup result. By configuring at least one floating point lookup subunit, the lookup table operation is processed in parallel, which effectively improves its speed; and because the lookup table operation is performed directly on floating point type data, the overhead caused by data type conversion is reduced.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a processing apparatus according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of another processing apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a floating point lookup subunit according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a mapping relationship of a lookup table according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another processing apparatus according to an embodiment of the present application;
FIG. 7 is a schematic flowchart of a processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of the invention and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art understand, both explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in FIG. 1, the computer device may include a processor, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor. The computer device may further include a communication bus, an input device, and an output device; the processor, the memory, the input device, and the output device may be connected to each other through the communication bus.
The processor is configured to implement the following steps when executing the program stored in the memory:
the data transmission unit receives input data and broadcasts the input data to the floating point searching unit;
the floating point search unit receives the input data, each floating point search subunit in the floating point search unit executes a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point search sub-result, and the floating point search sub-result is returned to the data transmission unit, wherein the input data are floating point type data;
and the data transmission unit receives the floating point search sub-results and sorts the floating point search sub-results according to the input data to obtain floating point search results.
Further, the processor may be a Central Processing Unit (CPU), a Neural-network Processing Unit (NPU), a Graphics Processing Unit (GPU), or an image processing unit, which is not limited in this application. Depending on the processor, the processing method provided by the embodiments of the application can be applied to artificial intelligence fields such as image recognition, deep learning, computer vision, intelligent robotics, and natural language processing, to execute complex function programs in those fields. For example, in image recognition, an image processor can increase the contrast and brightness of an image through a lookup table operation; in computer vision, the processor can adjust the color difference of the image on a display screen through a lookup table operation.
Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a processing apparatus 200 according to an embodiment of the present application, where the apparatus 200 is applied to the computer device shown in FIG. 1. As shown in FIG. 2, the apparatus 200 includes a data transmission unit 21 and a floating point lookup unit 22, where the floating point lookup unit 22 includes at least one floating point lookup subunit 220;
the floating point lookup unit 22 is configured to receive input data, each floating point lookup subunit 220 in the floating point lookup unit 22 performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point lookup sub-result, and returns the floating point lookup sub-result to the data transmission unit 21, where the input data is floating point type data;
the data transmission unit 21 is configured to receive the input data, broadcast the input data to the floating point search unit 22, receive the floating point search sub-results, and sort the floating point search sub-results according to the input data to obtain floating point search results.
Specifically, the floating point lookup unit 22 includes one or more floating point lookup subunits 220, which, as shown in FIG. 2, may be floating point lookup subunit 1, floating point lookup subunit 2, ..., and floating point lookup subunit N, where N is a positive integer. Each floating point lookup subunit 220 implements the lookup table function for part of the data. Upon receiving the input data, the data transmission unit 21 broadcasts the input data to each floating point lookup subunit 220 of the floating point lookup unit 22 to perform the lookup table operation. Each floating point lookup subunit 220 receives the input data, performs the lookup table operation only on those input data that fall within the data range of that floating point lookup subunit 220, and returns its floating point lookup sub-result to the data transmission unit 21. The data transmission unit 21 then aggregates the data according to the storage locations of the input data and writes each floating point lookup sub-result back to the storage location of the corresponding input data.
For example, as shown in FIG. 3, the input data is abcdefgh, and the floating point lookup unit 22 includes floating point lookup subunit 1, floating point lookup subunit 2, and floating point lookup subunit 3. After receiving abcdefgh, the data transmission unit 21 broadcasts it to floating point lookup subunit 1, floating point lookup subunit 2, and floating point lookup subunit 3. After receiving abcdefgh, floating point lookup subunit 1 performs the lookup table operation only on the data a and c in the input data to obtain the floating point lookup sub-result a'c'; floating point lookup subunit 2 performs the lookup table operation only on the data d, e and g to obtain the floating point lookup sub-result d'e'g'; floating point lookup subunit 3 performs the lookup table operation only on the data b and f to obtain the floating point lookup sub-result b'f'. Floating point lookup subunit 1, floating point lookup subunit 2, and floating point lookup subunit 3 then each send their floating point lookup sub-results to the data transmission unit 21. After receiving the floating point lookup sub-results, the data transmission unit 21 aggregates and sorts them according to the storage locations of the input data, and obtains the floating point lookup result a'b'c'd'e'f'g'.
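The broadcast, filter and gather flow of this example can be sketched in Python as follows. This is only an illustrative software model under stated assumptions, not the hardware described in the application: the names `Subunit`, `subunit_lookup` and `broadcast_and_gather`, the half-open ranges, and the additive mappings are all hypothetical, and the sequential loop merely stands in for subunits that would operate in parallel.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical subunit description: a half-open data range [low, high)
# plus the mapping applied to values inside that range.
Subunit = Tuple[float, float, Callable[[float], float]]


def subunit_lookup(subunit: Subunit, data: List[float]) -> Dict[int, float]:
    """Look up only the broadcast values inside this subunit's range.

    The sub-result is keyed by the position of each value in the input,
    so that it can later be written back to the matching location.
    """
    low, high, mapping = subunit
    return {pos: mapping(x) for pos, x in enumerate(data) if low <= x < high}


def broadcast_and_gather(subunits: List[Subunit],
                         data: List[float]) -> List[Optional[float]]:
    """Broadcast the data to every subunit, then order sub-results by input position."""
    results: List[Optional[float]] = [None] * len(data)
    for subunit in subunits:  # in hardware these would run in parallel
        for pos, value in subunit_lookup(subunit, data).items():
            results[pos] = value
    return results


# Seven inputs playing the roles of a..g: subunit 1 takes positions 0 and 2,
# subunit 2 takes positions 3, 4 and 6, subunit 3 takes positions 1 and 5.
SUBUNITS: List[Subunit] = [
    (0.5, 1.0, lambda x: x + 10.0),
    (1.0, 2.0, lambda x: x + 20.0),
    (2.0, 4.0, lambda x: x + 30.0),
]
DATA = [0.5, 2.5, 0.75, 1.0, 1.5, 3.0, 1.25]
RESULT = broadcast_and_gather(SUBUNITS, DATA)
```

Keying each sub-result by input position is one simple way to model "sorting according to the storage location of the input data"; any position left as `None` would correspond to an input that no subunit covered.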
Further, the number of floating point lookup subunits 220 included in the floating point lookup unit 22 may be adjusted according to hardware design requirements, such as hardware area or performance requirements. Each floating point lookup subunit 220 is responsible for the lookup table operation on data within a certain range. When the data range of the input data exceeds the range covered by all the floating point lookup subunits 220, the configuration information of each floating point lookup subunit 220 in the floating point lookup unit 22 can be set multiple times, so that the floating point lookup subunits 220 are reused to realize the lookup table function.
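The reuse of a fixed number of subunits over several configuration passes can be sketched as below. This is a minimal model under assumptions: the helper names `run_pass`, `all_looked_up` and `lookup_with_reuse`, the representation of a configuration as a `(low, high, mapping)` tuple, and the policy of handing out configurations in fixed-size batches are all hypothetical and are not specified in the application.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical configuration: half-open data range [low, high) and its mapping.
RangeConfig = Tuple[float, float, Callable[[float], float]]


def run_pass(configs: List[RangeConfig], data: List[float],
             results: List[Optional[float]]) -> None:
    """One pass: each configured subunit fills in the positions it covers."""
    for low, high, mapping in configs:
        for pos, x in enumerate(data):
            if results[pos] is None and low <= x < high:
                results[pos] = mapping(x)


def all_looked_up(results: List[Optional[float]]) -> bool:
    """Verification step: has every input value received a lookup result?"""
    return all(r is not None for r in results)


def lookup_with_reuse(all_configs: List[RangeConfig], data: List[float],
                      num_subunits: int) -> Tuple[List[Optional[float]], int]:
    """Reconfigure the same num_subunits subunits until all data are covered."""
    results: List[Optional[float]] = [None] * len(data)
    passes = 0
    for start in range(0, len(all_configs), num_subunits):
        run_pass(all_configs[start:start + num_subunits], data, results)
        passes += 1
        if all_looked_up(results):
            break  # no further reconfiguration is needed
    return results, passes


def double(x: float) -> float:
    return 2.0 * x


# Four ranges but only two subunits, so two passes are needed for this input.
CONFIGS: List[RangeConfig] = [
    (1.0, 2.0, double), (2.0, 4.0, double),
    (4.0, 8.0, double), (8.0, 16.0, double),
]
RESULTS, PASSES = lookup_with_reuse(CONFIGS, [1.5, 9.0, 3.0, 5.0], num_subunits=2)
```

The trade-off illustrated here is the one the text names: fewer subunits cost less area but may need more passes over the same input.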
In the embodiment of the present application, at least one floating point lookup subunit 220 is configured to process the lookup table operations in parallel, so that each floating point lookup subunit 220 can process part of the data in the input data in parallel, the lookup table operation period of the input data is effectively shortened, and the operation speed of the lookup table operation is increased.
Optionally, as shown in fig. 4, fig. 4 is a schematic structural diagram of a floating point lookup subunit 220 provided in an embodiment of the present application, and as shown in fig. 4, the floating point lookup subunit 220 includes: a configuration module 221, a lookup module 222, and an operation module 223, wherein:
the configuration module 221 is configured to set configuration information corresponding to each floating point lookup subunit 220 in the floating point lookup unit 22, where the configuration information includes a configuration table and a lookup table;
the lookup module 222 is configured to determine, according to the configuration information, the data range of the input data currently searched by each floating point lookup subunit 220 in the floating point lookup unit 22, determine the data to be processed from the input data according to that data range, and send the data to be processed to the corresponding operation module 223;
the operation module 223 is configured to process the data to be processed according to the mapping relationship of the lookup table, and obtain a mapping value corresponding to the data to be processed, where the mapping value is the floating point lookup sub-result.
In this embodiment, the configuration module 221 configures a corresponding configuration table and a corresponding lookup table for each floating point lookup subunit 220 according to the functional requirements. After each floating point lookup subunit 220 in the floating point lookup unit 22 receives the input data broadcast by the data transmission unit 21, according to the received input data, the configuration table, and the lookup table, a lookup table operation is performed on the input data within the processing range of the lookup table of the current floating point lookup subunit 220.
The configuration table includes a first parameter, a second parameter, a third parameter, and a fourth parameter. The first parameter is used to determine the size of a lookup table segment, the second parameter is used to determine the exponent range of the lookup table, the third parameter is used to select a specific lookup table segment, and the fourth parameter is used to determine the number of segments within a lookup table segment.
Specifically, the first parameter may be denoted by base, which determines the size of each lookup table segment: each lookup table segment has size $2^{base}$. The second parameter may be denoted by base_index, which determines the exponent range of the lookup table, so that base and base_index together determine the data range over which the current floating point lookup subunit 220 performs the lookup table operation; this data range can be expressed as $[\pm 2^{base+base\_index-1}, \pm 2^{base+base\_index}]$. The third parameter may be denoted by index, which selects a specific lookup table segment; the sign of index indicates whether the lookup table segment lies on the positive or the negative half-axis: a positive index indicates the positive half-axis, and a negative index indicates the negative half-axis. The fourth parameter may be denoted by k, which determines the number of segments within each lookup table segment. The lookup table provides the mapping relation of the lookup table; each segment within a lookup table segment corresponds to one numerical value mapping, and each lookup table segment can correspond to k numerical values. For example, if base is 2, base_index is 2, index is 1, and k is 2, then base gives a lookup table segment size of $2^2$, i.e. 4, and base and base_index give an exponent range of $[\pm 2^3, \pm 2^4]$, i.e. the data range of the lookup table is $[\pm 8, \pm 16]$. Since each lookup table segment in this data range has size 4, the lookup table contains 4 lookup table segments, namely [-16, -12], [-12, -8], [8, 12] and [12, 16]. The index selects the specific lookup table segment [8, 12], and this lookup table segment comprises 2 segments, namely [8, 10] and [10, 12].
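The arithmetic of this worked example can be reproduced with a short Python sketch. The function names are hypothetical, and two points are assumptions rather than statements from the application: that a positive index i selects the i-th lookup table segment counting outward from zero on the positive half-axis (and a negative index the mirrored segment), and that base_index is at least 1 so the range divides evenly into segments of size $2^{base}$. The split into k equal parts follows the worked example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def data_range(base: int, base_index: int) -> Interval:
    """Magnitude bounds 2^(base+base_index-1) and 2^(base+base_index)."""
    return 2.0 ** (base + base_index - 1), 2.0 ** (base + base_index)


def table_segments(base: int, base_index: int) -> List[Interval]:
    """All lookup table segments of size 2^base on both half-axes, ascending."""
    low, high = data_range(base, base_index)
    width = 2.0 ** base
    count = int((high - low) / width)  # assumes base_index >= 1
    positive = [(low + i * width, low + (i + 1) * width) for i in range(count)]
    negative = [(-b, -a) for a, b in reversed(positive)]
    return negative + positive


def select_segment(base: int, base_index: int, index: int) -> Interval:
    """Pick one lookup table segment; the sign of index picks the half-axis.

    Assumption: |index| counts segments outward from zero, starting at 1.
    """
    if index == 0:
        raise ValueError("index must be non-zero")
    low, _ = data_range(base, base_index)
    width = 2.0 ** base
    a = low + (abs(index) - 1) * width
    b = a + width
    return (a, b) if index > 0 else (-b, -a)


def split_segment(segment: Interval, k: int) -> List[Interval]:
    """Split a lookup table segment into k equal parts, as in the worked example."""
    a, b = segment
    step = (b - a) / k
    return [(a + j * step, a + (j + 1) * step) for j in range(k)]


SEGMENTS = table_segments(base=2, base_index=2)
CHOSEN = select_segment(base=2, base_index=2, index=1)
PARTS = split_segment(CHOSEN, k=2)
```

With base = 2, base_index = 2, index = 1 and k = 2, `SEGMENTS`, `CHOSEN` and `PARTS` evaluate to the four lookup table segments, the segment [8, 12], and the two parts [8, 10] and [10, 12] given in the text.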
In the embodiment of the present application, by configuring different configuration tables and lookup tables for each floating point lookup subunit 220, each floating point lookup subunit 220 can implement the lookup table function of different parts of data, and by configuring a plurality of floating point lookup subunits 220, the configuration can be flexibly selected in terms of performance and hardware area overhead, thereby simplifying the design of hardware.
Optionally, the search module 222 is specifically configured to:
determining the number of segments of the lookup segment for which the floating point lookup unit 22 performs the lookup table operation and the data range corresponding to each segment for performing the lookup table operation according to the first parameter and the second parameter;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit 220 according to the data range.
Specifically, each floating point lookup subunit 220 includes a configuration module 221, and each configuration module 221 includes a configuration table. The first parameter and the second parameter in the configuration table determine the number of lookup segments on which each floating point lookup subunit 220 performs the lookup table operation and the data range corresponding to each segment, so that the configuration module 221 can select, from the input data, the lookup data that falls within the data range in which the floating point lookup subunit 220 performs the lookup table operation. The third parameter, the fourth parameter, the number of segments, and the data range corresponding to each segment then determine the data range of the lookup data within the corresponding lookup table segment, and the data to be processed by the floating point lookup subunit 220 can be determined according to that data range.
In the embodiment of the application, the lookup table provides the mapping relation within the data range; each segment within a lookup table segment provides one value mapping, for a total of $2^k$ values in one lookup table segment. The lookup module 222 determines, from the input data and according to the configuration table in the configuration information, the data to be processed that lies within the data range of the floating point lookup subunit 220, and transmits the data to be processed to the operation module 223. The operation module 223 obtains the mapping value corresponding to the data to be processed according to the lookup table provided by the lookup module 222 and the mapping relation of the lookup table, thereby obtaining the floating point search sub-result of the floating point lookup subunit 220, and writes the floating point search sub-result back to the data transmission unit 21 through the data path of the lookup module 222, where the mapping value is the floating point search sub-result.
As shown in fig. 5, fig. 5 is a schematic diagram of a mapping relationship of a lookup table according to an embodiment of the present application. The operation module 223 determines which data range the current data falls in according to the configuration information provided by the lookup module 222, and maps the data to be processed to the lookup table data according to the mapping relationship of the lookup table to obtain a mapping value. Specifically, the lookup module 222 compares the exponent bits of the input data with the first parameter and selects the input data for which the difference equals the second parameter; it then compares the significand bits of the selected input data with the third parameter to find the data to be processed that matches the third parameter. The lookup module 222 transmits the data to be processed to the operation module 223, and the operation module 223 determines the segment within the lookup table segment according to the significand bits of the data to be processed and performs the lookup table operation to obtain the mapping value corresponding to the data to be processed.
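The comparison on exponent and significand bits can be sketched in Python for IEEE 754 single-precision values. This is a rough illustration under stated assumptions, not the circuit of the embodiment. The membership test `exponent == base + base_index - 1` is derived from the data range formula and may differ from the exact comparison used in hardware. The helper names and the parameter `k_bits` (which assumes the number of sub-segments is a power of two, $2^{k\_bits}$) are hypothetical.

```python
import struct


def float32_fields(x):
    """Split a value, encoded as IEEE 754 binary32, into sign, unbiased exponent, mantissa."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127   # remove the exponent bias
    mantissa = bits & 0x7FFFFF               # 23 stored fraction bits
    return sign, exponent, mantissa


def locate(x, base, base_index, k_bits):
    """Return (index, sub_segment) if x lies in the configured range, else None."""
    sign, exponent, mantissa = float32_fields(x)
    # Assumption: magnitudes in [2^(base+base_index-1), 2^(base+base_index))
    # all share this single unbiased exponent.
    if exponent != base + base_index - 1:
        return None
    seg_bits = base_index - 1                # bits that select the table segment
    top = mantissa >> (23 - seg_bits - k_bits)
    table_segment = (top >> k_bits) + 1      # 1-based, counted by magnitude
    sub_segment = top & ((1 << k_bits) - 1)  # position inside the table segment
    index = -table_segment if sign else table_segment
    return index, sub_segment
```

For base = 2, base_index = 2, and k_bits = 1, the value 9.0 maps to index 1, sub-segment 0 (the range [8, 10]), and 11.0 maps to index 1, sub-segment 1 (the range [10, 12]). No integer conversion of the value itself is needed; only its bit fields are inspected.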
In the embodiment of the application, the process of converting floating point type data into integer type data and then converting integer type data into floating point type data after executing the function of the lookup table is avoided by directly performing lookup table operation on the floating point type data, so that the overhead caused by data type conversion is reduced.
Optionally, the data transmission unit 21 is further configured to:
according to the floating point search result, it is determined whether all data in the input data are subjected to the lookup table operation, so as to obtain a verification result, and the verification result is sent to each floating point search subunit 220 in the floating point search unit 22.
Specifically, after receiving the floating point search sub-result sent by each floating point search subunit 220, the data transmission unit 21 matches the content of each floating point search sub-result with the input data, determines the position of the input data to which that content corresponds, and stores the content at that position to obtain the floating point search result. Whether all data in the input data have been subjected to the lookup table operation is then determined according to whether the floating point search result contains vacant positions, which yields the verification result. If the floating point search result has a vacant position, the verification result is that not all data in the input data have been subjected to the lookup table operation; if the floating point search result has no vacant position, the verification result is that all data in the input data have been subjected to the lookup table operation.
Further, when the verification result is that not all the data in the input data have been subjected to the lookup table operation, the data transmission unit 21 may broadcast the input data to each floating point lookup subunit 220 again to perform the lookup table operation; alternatively, the data transmission unit 21 may broadcast only the input data corresponding to the vacant positions in the floating point search result, i.e. the data that have not yet been subjected to the lookup table operation, to each floating point lookup subunit 220 to perform the lookup table operation.
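The vacancy check described above can be sketched as follows. This is a minimal Python illustration; the representation of a sub-result as a dictionary from input position to mapped value, the use of `None` for a vacant position, and the function names are all assumptions made for the example.

```python
def merge_sub_results(input_len, sub_results):
    """Store each sub-result value at the position of its input datum."""
    merged = [None] * input_len           # None marks a vacant position
    for sub in sub_results:
        for position, value in sub.items():
            merged[position] = value
    return merged


def verify(merged):
    """Return (all_looked_up, vacant_positions) for a merged search result."""
    vacant = [i for i, value in enumerate(merged) if value is None]
    return len(vacant) == 0, vacant


merged = merge_sub_results(4, [{0: 1.5, 2: 2.5}, {1: 0.5}])
complete, vacant = verify(merged)
```

Here position 3 was not covered by any sub-result, so `complete` is False and `vacant` lists position 3. Only the data at the vacant positions would need to be broadcast again under the second option described above.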
Optionally, the configuration module 221 is further configured to:
and, when the verification result indicates that not all the data in the input data have been subjected to the lookup table operation, resetting the configuration information corresponding to each floating point lookup subunit 220 in the floating point lookup unit 22.
When the data range of the input data exceeds the coverage of all the floating point lookup subunits 220, or when there are few floating point lookup subunits 220 (for example, only one), a single lookup table operation may not cover the data range of all the input data. For example, if part of the input data abcdefgh lies outside the data coverage of floating point lookup subunits 1, 2, and 3, the data h in the input data is not subjected to the lookup table operation. In that case, when the verification result indicates that not all data in the input data have been subjected to the lookup table operation, the configuration module 221 may update the configuration information of each floating point lookup subunit 220, and the lookup table operation is performed on the same data again.
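The reconfigure-and-retry behaviour can be illustrated with a small Python sketch. It is only a model under assumptions: each configuration is reduced to a hypothetical `(low, high, mapping)` triple covering the half-open range [low, high), and each element of `passes` stands for the configuration loaded into the subunits for one round.

```python
def run_passes(data, passes):
    """Apply successive subunit configurations until every datum is mapped."""
    result = [None] * len(data)
    for configs in passes:                # one pass = one (re)configuration
        pending = [i for i, value in enumerate(result) if value is None]
        if not pending:                   # verification passed, stop early
            break
        for i in pending:
            for low, high, mapping in configs:
                if low <= data[i] < high:
                    result[i] = mapping(data[i])
                    break
    return result


first = [(0.0, 4.0, lambda x: "a"), (8.0, 16.0, lambda x: "b")]
second = [(16.0, 32.0, lambda x: "c")]
one_pass = run_passes([1.0, 9.0, 20.0], [first])
two_passes = run_passes([1.0, 9.0, 20.0], [first, second])
```

After the first pass the value 20.0 is still unmapped; after the configuration is updated and the data are processed again, all three values have a result.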
Optionally, as shown in fig. 6, the apparatus further includes a storage unit 23, where the storage unit 23 is configured to store the input data and the floating point search result.
Specifically, the input data may be stored in the storage unit 23 of the present apparatus, and the data transmission unit 21 may acquire the input data from the storage unit 23; for example, the user may store the input data in a memory provided in a computer as shown in fig. 1. The data transmission unit 21 may also store the obtained floating point search result in the storage unit 23.
It can be seen that the processing apparatus according to the embodiment of the present application includes the data transmission unit 21 and the floating point search unit 22, and the floating point search unit 22 includes at least one floating point search subunit 220. The floating point search unit 22 is configured to receive input data; each floating point search subunit 220 in the floating point search unit 22 performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point search sub-result, and returns the floating point search sub-result to the data transmission unit 21, where the input data is floating point type data. The data transmission unit 21 is configured to receive the input data, broadcast the input data to the floating point search unit 22, receive the floating point search sub-results, and sort the floating point search sub-results according to the input data to obtain the floating point search result. By configuring at least one floating point search subunit 220 to process lookup table operations in parallel, the apparatus effectively increases the speed of the lookup table operation; and because the lookup table operation is performed directly on floating point type data, the overhead of data type conversion is reduced.
For example, when the processing apparatus provided by the present application performs image recognition processing, the data transmission unit 21 receives image recognition input data and broadcasts it to the floating point search unit 22. The floating point search unit 22 receives the image recognition input data; each floating point search subunit 220 in the floating point search unit 22 performs a lookup table operation in parallel according to the corresponding image recognition input data to obtain a corresponding floating point search sub-result, and returns the floating point search sub-result to the data transmission unit 21, where the image recognition input data is floating point type data. The data transmission unit 21 receives the floating point search sub-results and sorts them according to the image recognition input data to obtain the floating point search result. Configuring at least one floating point search subunit 220 to process lookup table operations in parallel effectively increases the speed of the lookup table operation and thereby improves the operating efficiency of the image recognition system.
Further, when the processing apparatus provided by the present application performs deep learning, the data transmission unit 21 receives deep learning input data and broadcasts it to the floating point search unit 22. The floating point search unit 22 receives the deep learning input data; each floating point search subunit 220 in the floating point search unit 22 performs a lookup table operation in parallel according to the corresponding deep learning input data to obtain a corresponding floating point search sub-result, and returns the floating point search sub-result to the data transmission unit 21, where the deep learning input data is floating point type data. The data transmission unit 21 receives the floating point search sub-results and sorts them according to the deep learning input data to obtain the floating point search result. Configuring at least one floating point search subunit 220 to process lookup table operations in parallel effectively increases the speed of the lookup table operation and thereby improves the operating efficiency of the deep learning system.
Referring to fig. 7, fig. 7 is a flowchart illustrating a processing method according to an embodiment of the present disclosure, applied in the processing apparatus shown in fig. 2, where the processing apparatus includes a data transmission unit and a floating point lookup unit, and the floating point lookup unit includes at least one floating point lookup subunit. As shown in fig. 7, the method includes the steps of:
S710, the data transmission unit receives input data and broadcasts the input data to the floating point search unit;
S720, the floating point search unit receives the input data, each floating point search subunit in the floating point search unit performs a lookup table operation in parallel according to the corresponding input data to obtain a corresponding floating point search sub-result, and the floating point search sub-result is returned to the data transmission unit, wherein the input data are floating point type data;
S730, the data transmission unit receives the floating point search sub-results and sorts the floating point search sub-results according to the input data to obtain the floating point search result.
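Steps S710 to S730 can be modelled in software as a broadcast, a parallel lookup, and a reordering step. The Python sketch below is an assumption-laden illustration rather than the claimed apparatus: threads stand in for hardware subunits, each subunit is a hypothetical closure built by `make_subunit` over a uniform table on [low, high), and sub-results are dictionaries keyed by input position.

```python
from concurrent.futures import ThreadPoolExecutor


def make_subunit(low, high, values):
    """Build a subunit covering [low, high) with one table value per equal segment."""
    step = (high - low) / len(values)

    def lookup(data):
        sub_result = {}
        for position, x in enumerate(data):
            if low <= x < high:                       # datum belongs to this subunit
                sub_result[position] = values[int((x - low) // step)]
        return sub_result

    return lookup


def process(data, subunits):
    # S710: broadcast the same input data to every subunit.
    # S720: every subunit performs its lookup in parallel.
    with ThreadPoolExecutor(max_workers=len(subunits)) as pool:
        sub_results = list(pool.map(lambda unit: unit(data), subunits))
    # S730: place each sub-result at the position of its input datum.
    result = [None] * len(data)
    for sub in sub_results:
        for position, value in sub.items():
            result[position] = value
    return result


subunits = [make_subunit(8.0, 12.0, ["A", "B"]), make_subunit(12.0, 16.0, ["C", "D"])]
ordered = process([13.0, 9.0, 11.5, 15.0], subunits)
```

Each subunit returns results only for the data in its own range, and the final list follows the order of the input data regardless of which subunit produced each value.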
Optionally, the method further includes:
setting configuration information corresponding to each floating point searching subunit in the floating point searching unit, wherein the configuration information comprises a configuration table and a searching table;
determining a data range of currently searched input data of each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to a corresponding operation module;
and processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.
Optionally, the configuration table includes: a first parameter, a second parameter, a third parameter, and a fourth parameter;
the first parameter is used to determine a size of a lookup table segment, the second parameter is used to determine an exponent range of the lookup table, the third parameter is used to determine a lookup table segment, and the fourth parameter is used to determine a number of segments of a lookup table segment.
Optionally, the determining the data to be processed from the input data according to the data range includes:
according to the first parameter and the second parameter, determining the number of segments of the lookup segment for which the floating point lookup unit executes the lookup table operation and the data range corresponding to each segment for executing the lookup table operation;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit according to the data range.
Optionally, the method further includes:
and determining whether all data in the input data are subjected to lookup table operation or not according to the floating point lookup result to obtain a verification result.
Optionally, the method further includes: and resetting the configuration information corresponding to each floating point search subunit in the floating point search unit when the verification result indicates that all data in the input data are not subjected to the lookup table operation.
Optionally, the method further includes: and storing the input data and the floating point search result.
It can be understood that the specific implementation manner of the processing method according to the embodiment of the present application may be according to the specific implementation manner in the embodiment of the processing apparatus, and the specific implementation process may refer to the description related to the embodiment of the apparatus, which is not described herein again.
Embodiments of the present application also provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute part or all of the steps of any one of the methods as described in the above method embodiments.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the above-described division of the units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, a terminal device, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (16)

1. A processing apparatus, comprising a data transfer unit, a floating point lookup unit, wherein: the floating point search unit comprises at least one floating point search subunit;
the floating point search unit is used for receiving input data, each floating point search subunit in the floating point search unit executes a search table operation in parallel according to the corresponding input data to obtain a corresponding floating point search subresult, and the floating point search subresult is returned to the data transmission unit, wherein the input data are floating point type data;
the data transmission unit is used for receiving the input data, broadcasting the input data to the floating point search unit, receiving the floating point search sub-results, and sorting the floating point search sub-results according to the input data to obtain the floating point search results.
2. The apparatus of claim 1, wherein the floating point lookup subunit comprises: the device comprises a configuration module, a search module and an operation module, wherein:
the configuration module is configured to set configuration information corresponding to each floating point lookup subunit in the floating point lookup unit, where the configuration information includes a configuration table and a lookup table;
the searching module is used for determining the data range of currently searched input data of each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to the corresponding operation module;
and the operation module is used for processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.
3. The apparatus of claim 2, wherein the configuration table comprises: a first parameter, a second parameter, a third parameter, and a fourth parameter;
the first parameter is used to determine a size of a lookup table segment, the second parameter is used to determine an exponential range of the lookup table, the third parameter is used to determine a lookup table segment, and the fourth parameter is used to determine a number of segments of a lookup table segment.
4. The apparatus of claim 3, wherein the lookup module is specifically configured to:
according to the first parameter and the second parameter, determining the number of segments of the lookup segment for which the floating point lookup unit executes the lookup table operation and the data range corresponding to each segment for executing the lookup table operation;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit according to the data range.
5. The apparatus of claim 2, wherein the data transmission unit is further configured to:
and determining whether all data in the input data are subjected to lookup table operation or not according to the floating point lookup result to obtain a verification result.
6. The apparatus of claim 5, wherein the configuration module is further configured to:
and resetting the configuration information corresponding to each floating point search subunit in the floating point search unit when the verification result indicates that all data in the input data are not subjected to the lookup table operation.
7. The apparatus according to any one of claims 1-6, wherein the processing apparatus further comprises a storage unit; the storage unit is used for storing the input data and the floating point search result.
8. A processing method applied to a processing device, wherein the processing device comprises a data transmission unit and a floating point search unit, the floating point search unit comprises at least one floating point search subunit, and the method comprises the following steps:
the data transmission unit receives input data and broadcasts the input data to the floating point searching unit;
the floating point search unit receives the input data, each floating point search subunit in the floating point search unit executes a search table operation in parallel according to the corresponding input data to obtain a corresponding floating point search subresult, and the floating point search subresult is returned to the data transmission unit, wherein the input data are floating point type data;
and the data transmission unit receives the floating point search sub-results, and orders the floating point search sub-results according to the input data to obtain the floating point search results.
9. The method of claim 8, further comprising:
setting configuration information corresponding to each floating point searching subunit in the floating point searching unit, wherein the configuration information comprises a configuration table and a searching table;
determining a data range of currently searched input data of each floating point searching subunit in the floating point searching unit according to the configuration information, determining data to be processed from the input data according to the data range, and sending the data to be processed to a corresponding operation module;
and processing the data to be processed according to the mapping relation of the lookup table to obtain a mapping value corresponding to the data to be processed, wherein the mapping value is the floating point lookup sub-result.
10. The method of claim 9, wherein the configuration table comprises: a first parameter, a second parameter, a third parameter, and a fourth parameter;
the first parameter is used to determine a size of a lookup table segment, the second parameter is used to determine an exponential range of the lookup table, the third parameter is used to determine a lookup table segment, and the fourth parameter is used to determine a number of segments of a lookup table segment.
11. The method of claim 10, wherein determining the data to be processed from the input data according to the data range comprises:
according to the first parameter and the second parameter, determining the number of segments of the lookup segment for which the floating point lookup unit executes the lookup table operation and the data range corresponding to each segment for executing the lookup table operation;
determining the data range according to the third parameter, the fourth parameter, the number of the segments and the data range corresponding to each segment and used for executing the lookup table operation;
and determining the data to be processed of the floating point search subunit according to the data range.
12. The method of claim 9, further comprising:
and determining whether all data in the input data are subjected to lookup table operation or not according to the floating point lookup result to obtain a verification result.
13. The method of claim 12, further comprising:
and resetting the configuration information corresponding to each floating point search subunit in the floating point search unit when the verification result indicates that all data in the input data are not subjected to the lookup table operation.
14. The method according to any one of claims 8-13, further comprising:
and storing the input data and the floating point search result.
15. A computer device, characterized in that the computer device comprises a processor, a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be processed by the processor, the program comprising instructions for performing the steps in the method of any of claims 8-14.
16. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a computer program stored for data exchange, which computer program, when being executed by a processor, carries out the method according to any one of claims 8-14.
CN202010446944.7A 2020-05-25 2020-05-25 Processing device and method Active CN111651486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010446944.7A CN111651486B (en) 2020-05-25 2020-05-25 Processing device and method

Publications (2)

Publication Number Publication Date
CN111651486A true CN111651486A (en) 2020-09-11
CN111651486B CN111651486B (en) 2023-06-27

Family

ID=72343337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010446944.7A Active CN111651486B (en) 2020-05-25 2020-05-25 Processing device and method

Country Status (1)

Country Link
CN (1) CN111651486B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598197A (en) * 2015-01-26 2015-05-06 中国科学院自动化研究所 Operation method for reciprocal value and/or reciprocal square root of floating-point number and operation device
CN109976808A (en) * 2017-12-26 2019-07-05 三星电子株式会社 The method and system and memory die of memory look-up mechanism
US20190212980A1 (en) * 2018-01-09 2019-07-11 Samsung Electronics Co., Ltd. Computing accelerator using a lookup table
CN110032708A (en) * 2018-01-09 2019-07-19 三星电子株式会社 The method for calculating the method and system of product and calculating dot product and calculating convolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
晏敏; 何欣; 李沙; 祝龙; 赵丽: "Design and implementation of single-precision reciprocal based on the first-order Taylor series lookup table method" *

Also Published As

Publication number Publication date
CN111651486B (en) 2023-06-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant