CN108549933A

CN108549933A - Data processing method, apparatus, electronic device, and computer-readable medium

Info

Publication number: CN108549933A
Application number: CN201810368872.1A
Authority: CN
Inventors: 周舒畅; 胡晨
Original assignee: Beijing Megvii Technology Co Ltd
Current assignee: Beijing Megvii Technology Co Ltd
Priority date: 2018-04-23
Filing date: 2018-04-23
Publication date: 2018-09-18

Abstract

The present invention provides a data processing method, apparatus, electronic device, and computer-readable medium. The method includes: obtaining data to be calculated, where there are multiple items of data to be calculated, the bit width of each item is less than or equal to N, and N is a positive integer; and performing an addition and a low-bit discard operation on the data to be calculated by a first adder, and outputting the high M bits of the addition result as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N. The invention effectively reduces the hardware logic resources consumed when addition is performed in hardware.

Description

Data processing method, apparatus, electronic device, and computer-readable medium

Technical field

The present invention relates to the technical field of data processing, and more particularly to a data processing method, apparatus, electronic device, and computer-readable medium.

Background art

With the rapid development of artificial intelligence, data processing techniques have been applied widely and successfully in fields with massive amounts of data, such as images, speech, and text. Data processing techniques based on artificial intelligence, for example neural network algorithms, are used especially widely in fields such as images and speech. The core computation in a traditional neural network is multiply-accumulate, and its input and output formats usually use double-precision or single-precision floating-point numbers. Taking images as an example, the general algorithm framework is as follows. First, according to the image processing task, the input image is converted into a high-dimensional tensor in matrix form and passed to the neural network as floating-point input. Then each compute node in the network is evaluated by the network's core floating-point convolution operations, and the floating-point results are passed on layer by layer to the next layer. Next, according to the input label information, the error between the prediction and the label information is obtained and the weight parameters of the network are updated by backpropagation; the forward and backward passes are repeated until the whole network is fitted to the required accuracy. When the network model is used, the image to be detected is fed to the trained network, and forward propagation yields the final label information.

When the above algorithm framework is computed in hardware, it often leads to huge computational demands, a large memory footprint, and high bandwidth requirements, which places very high demands on implementing large-scale neural networks in hardware.

Summary of the invention

In view of this, the purpose of the present invention is to provide a data processing method, apparatus, electronic device, and computer-readable medium that effectively reduce the hardware logic resources consumed when addition is performed in hardware.

In a first aspect, an embodiment of the present invention provides a data processing method, including: obtaining data to be calculated, where there are multiple items of data to be calculated, the bit width of each item is less than or equal to N, and N is a positive integer; and performing an addition and a low-bit discard operation on the data to be calculated by a first adder, and outputting the high M bits of the addition result as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

Further, performing the addition on the data to be calculated by the first adder and outputting the high M bits of the addition result as the target calculation result includes: if the bit width of the data to be calculated is greater than or equal to a first preset bit width, performing the addition on the data to be calculated by the first adder, and outputting the high M bits of the addition result as the target calculation result.

Further, the method also includes: if the bit width of the data to be calculated is less than or equal to a second preset bit width, performing the addition on the data to be calculated by a second adder, and outputting the addition result.

Further, obtaining the data to be calculated includes: obtaining the data to be calculated of a first computation layer in an adder tree, where the first computation layer is any computation layer in the adder tree. Performing the addition on the data to be calculated by the first adder and outputting the high M bits of the addition result includes: performing an addition and a low-bit discard on the data to be calculated of the first computation layer by the first adder, and outputting the high M bits of the first computation layer's addition result as the target calculation result of the first computation layer.

Further, the method also includes: when addition is performed on the input data of a second computation layer, invoking the first adder; and performing an addition and a low-bit discard on the data to be calculated of the second computation layer using the first adder, and outputting the high M bits of the second computation layer's addition result as the target calculation result of the second computation layer, where the second computation layer is another computation layer located after the first computation layer in the adder tree.

Further, the first computation layer includes multiple groups of data to be calculated, and performing the addition on the data to be calculated of the first computation layer by the first adder and outputting the high M bits of the first computation layer's addition result includes: determining one first adder, using that one first adder to perform an addition and a low-bit discard operation on each group of input data in turn, and outputting the high M bits of each addition result; or determining one first adder for each group of data to be calculated, thereby obtaining multiple first adders, and using each first adder to perform an addition and a low-bit discard operation on its corresponding data to be calculated, so as to output the high M bits of the addition result.

Further, the method also includes: if the first adder has already been built, invoking the built first adder; if the first adder has not been built, building the first adder.

In a second aspect, an embodiment of the present invention provides a data processing apparatus, including: an acquiring unit, configured to obtain data to be calculated, where there are multiple items of data to be calculated, the bit width of each item is less than or equal to N, and N is a positive integer; and a computing unit, configured to perform an addition and a low-bit discard operation on the data to be calculated by a first adder, and to output the high M bits of the addition result as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method described above when executing the computer program.

In a fourth aspect, an embodiment of the present invention provides a computer-readable medium storing non-volatile program code executable by a processor, where the program code causes the processor to perform the steps of the method described above.

In embodiments of the present invention, data to be calculated is first obtained, where there are multiple items of data to be calculated, the bit width of each item is less than or equal to N, and N is a positive integer. Then an addition and a low-bit discard operation are performed on the data to be calculated by a first adder, and the high M bits of the addition result are output as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

In embodiments of the present invention, the first adder may also be called an adder that discards low bits. When the addition result of the data to be calculated exceeds a certain bit width (for example, M), performing the addition and the low-bit discard operation on the data with this adder limits the calculation result to that bit width (for example, to M bits). This helps avoid using adders of ever higher bit width in subsequent calculations. Compared with conventional methods, which must keep using adders of higher bit width, this reduces the consumption of hardware logic resources and saves a large amount of logic resources, thereby alleviating the technical problem in the prior art that addition performed in hardware consumes considerable hardware logic resources.

Other features and advantages of the present invention will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention are realized and obtained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.

To make the above objectives, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Description of the drawings

To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of an electronic device provided by an embodiment of the present invention;

Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of an 8-bit-wide adder in the prior art;

Fig. 4 is a schematic diagram of the computation structure of an adder tree in the prior art;

Fig. 5 is a schematic structural diagram of a first adder according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of the computation structure of a first adder tree according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of the computation structure of a second adder tree according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of the computation structure of a third adder tree according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of the computation structure of a fourth adder tree according to an embodiment of the present invention;

Fig. 10 is a schematic diagram of the computation structure of a fifth adder tree according to an embodiment of the present invention;

Fig. 11 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

First, an electronic device 100 for implementing an embodiment of the present invention is described with reference to Fig. 1.

As shown in Fig. 1, the electronic device 100 includes one or more processors 102 and one or more memories 104. Optionally, the electronic device 100 may also include an input device 106, an output device 108, and a data collector 110; these components are interconnected by a bus system 112 and/or another form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are illustrative rather than restrictive; the electronic device may have other components and structures as needed.

The processor 102 may be implemented in at least one of the following hardware forms: a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic array (PLA), an application-specific integrated circuit (ASIC), or an ARM processor (Advanced RISC Machine, ARM). The processor 102 may be a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device 100 to perform desired functions.

The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functionality (implemented by the processor) in the embodiments of the present invention described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.

The input device 106 may be a device used by the user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.

The output device 108 may output various information (for example, images or sound) to the outside (for example, to the user), and may include one or more of a display, a speaker, and the like.

The data collector 110 is used for data acquisition, and the data it collects is input to the neural network for computation. For example, the data collector may capture an image desired by the user (such as a photo or video), which is then input to the neural network for computation. The data collector may also store the captured image in the memory 104 for use by other components.

Illustratively, the electronic device for implementing an embodiment of the present invention may be implemented as an intelligent terminal such as a video camera, a snapshot camera, a smartphone, or a tablet computer.

When addition is performed on hardware such as an FPGA, DSP, ARM processor, or ASIC, a large number of additions consumes a large amount of hardware logic resources. This is especially true when high-bit-width adders are used. For example, an 8-bit adder consumes far more hardware logic resources than a 4-bit adder.

Implementing a neural network algorithm on hardware such as an FPGA, DSP, ARM processor, or ASIC therefore consumes a large amount of hardware logic resources. In a neural network, convolution is one of the most computation-intensive modules, and the additions inside convolution account for a very large proportion of the network's operations. In existing approaches, the neural network is first quantized, so that in the convolution operations of the quantized network the low-bit-width input parameters are multiplied by a multiplier, which yields high-bit-width values. These high-bit-width values must then be added by a high-bit-width adder. For example, adding two N-bit numbers with an N-bit adder outputs an (N+1)-bit number (which includes one carry bit). As the calculation proceeds, adders of ever higher bit width are needed, and high-bit-width adders consume a large amount of hardware logic resources. For example, when logic operations are performed on an FPGA, a single 8-bit adder consumes a large amount of the FPGA's logic resources. Because the logic circuitry of an FPGA is limited, the existing approach wastes a large amount of the FPGA's logic resources. On this basis, embodiments of the present invention propose an adder that can discard low bits. This adder limits the calculation result to a certain bit width, which helps avoid using adders of higher bit width in subsequent calculations.

It should be noted that in embodiments of the present invention, high bit width and low bit width are relative concepts. The specific bit widths they refer to can be determined according to actual needs, and the embodiments of the present invention do not limit this. In one case, high-bit-width data is data whose bit width is greater than or equal to 8 bits, for example 11101110, and low-bit-width data is data whose bit width is less than 8 bits, for example 1110. In another case, high-bit-width data is data whose bit width is greater than or equal to 16 bits, for example 1110111000011010, and low-bit-width data is data whose bit width is less than 16 bits, for example 11100001.

In the following embodiments of the present invention, the description takes high-bit-width data to be data whose bit width is greater than or equal to 8 bits and low-bit-width data to be data whose bit width is less than 8 bits.

According to an embodiment of the present invention, an embodiment of a data processing method is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.

Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:

Step S202: obtain data to be calculated, where there are multiple items of data to be calculated, the bit width of each item is less than or equal to N, and N is a positive integer;

Step S204: perform an addition and a low-bit discard operation on the data to be calculated by a first adder, and output the high M bits of the addition result as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

Optionally, when M = N, performing the addition and the low-bit discard operation on the data to be calculated by the first adder, so as to output the high M bits of the addition result as the target calculation result, includes: when the first adder adds the data to be calculated, if the bit width of the addition result is greater than M (equivalently, N), performing the low-bit discard operation on the addition result and outputting its high M bits as the target calculation result.

When M = N and the first adder adds the data to be calculated, if the bit width of the addition result is equal to M (equivalently, N), the addition result is output directly, and no low-bit discard operation is performed on it.

Optionally, when M < N, performing the addition and the low-bit discard operation on the data to be calculated by the first adder, so as to output the high M bits of the addition result as the target calculation result, includes: when the first adder adds the data to be calculated, if the bit width of the addition result is greater than M (or N), performing the low-bit discard operation on the addition result and outputting its high M bits as the target calculation result.
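The behavior described in steps S202 and S204 can be sketched in software as follows. This is a minimal illustrative model, not the hardware design of the invention. The function names `keep_high_bits` and `first_adder` are hypothetical, the operands are assumed to be unsigned, and the sum of two N-bit operands is assumed to occupy a fixed N+1 bits (N result bits plus the carry).

```python
def keep_high_bits(value: int, width: int, m: int) -> int:
    """Return the high m bits of a `width`-bit value.

    If the value is already no wider than m bits, it is passed through
    unchanged (no low-bit discard is performed).
    """
    if width <= m:
        return value
    return value >> (width - m)


def first_adder(a: int, b: int, n: int, m: int) -> int:
    """Add two n-bit unsigned operands and keep the high m bits of the sum.

    Assumption: the sum is treated as a fixed (n + 1)-bit value, carry included.
    """
    if not 0 < m <= n:
        raise ValueError("M must satisfy 0 < M <= N")
    limit = 1 << n
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must fit in N bits")
    total = a + b  # fits in n + 1 bits
    return keep_high_bits(total, n + 1, m)


# 11 + 6 = 17 = 0b10001; dropping the lowest bit leaves 0b1000.
print(first_adder(0b1011, 0b0110, n=4, m=4))  # 8
# With M = 3 the two lowest bits are dropped, leaving 0b100.
print(first_adder(0b1011, 0b0110, n=4, m=3))  # 4
```

In this model the output never needs more than M bits, so a later addition stage can again use operands of at most N bits.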

It should be noted that in this embodiment the addition preferably refers to addition in a neural network, for example addition in a convolutional layer of a neural network. In addition, when other data processing methods are to be implemented with hardware logic resources, the data processing method proposed in this embodiment may likewise be used to perform addition; this embodiment places no specific limit on this.

For example, when a neural network algorithm, especially a deep neural network algorithm, is implemented on hardware such as an FPGA, convolution is one of the most computation-intensive modules in the entire network, and addition accounts for a very large proportion of its computation. In a quantized neural network module, the low-bit-width input parameters of a convolution operation pass through a multiplier to produce high-bit-width values, which must then be added. For example, if the high-bit-width values are 8 bits wide, addition must be performed on 8-bit values.

In existing computation methods, a high-bit-width adder is typically used directly on hardware such as an FPGA. For example, for the 8-bit values above, a high-bit-width adder as shown in Fig. 3 (that is, an 8-bit-wide adder) may be used to add two 8-bit inputs, where the output of the 8-bit-wide adder consists of 8 output bits and one carry output. However, this consumes a great deal of hardware logic resources, because the higher the bit width of the adder, the more FPGA hardware resources it consumes. In particular, in a quantized neural network that has been compressed from floating-point parameters to low bit width, each output channel of a convolution involves accumulating multiple low-bit-width numbers at once. This computation is often organized as an adder tree, in which each pair of N-bit numbers is added by an N-bit adder to produce an (N+1)-bit number (which includes one carry bit). As the calculation proceeds, adders of ever higher bit width are therefore needed.

Specifically, as shown in Fig. 4, suppose the data output by each output channel of the convolution is N-bit data. In computation layer 1 (Stage1), an N-bit adder adds two N-bit data items and produces (N+1)-bit data, which includes one carry bit. In computation layer 2 (Stage2), two (N+1)-bit data items must be added, so an (N+1)-bit adder is needed, and it produces (N+2)-bit data. In computation layer 3 (Stage3), two (N+2)-bit data items must be added, so an (N+2)-bit adder is needed, and it produces (N+3)-bit data. As this shows, adders of ever higher bit width are needed in the adder tree as the calculation proceeds, which wastes a large amount of hardware logic resources on hardware such as an FPGA.
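The width growth in the prior-art adder tree of Fig. 4 can be tabulated with a short sketch. The function name `exact_adder_tree_widths` is hypothetical, and the sketch assumes a balanced tree whose number of inputs is a power of two.

```python
def exact_adder_tree_widths(n: int, num_inputs: int) -> list[int]:
    """Adder width needed at each stage of an exact pairwise adder tree.

    Every stage adds pairs of equally wide values, and every sum gains
    one carry bit, so the next stage needs an adder one bit wider.
    """
    if num_inputs < 2 or num_inputs & (num_inputs - 1):
        raise ValueError("num_inputs must be a power of two and at least 2")
    widths = []
    width = n
    remaining = num_inputs
    while remaining > 1:
        widths.append(width)  # this stage adds pairs of `width`-bit values
        width += 1            # each sum carries one extra bit
        remaining //= 2
    return widths


# Eight 8-bit inputs need 8-, 9-, and 10-bit adders in Stage1 to Stage3.
print(exact_adder_tree_widths(8, 8))  # [8, 9, 10]
```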

On this basis, embodiments of the present invention propose a data processing method in which the calculation is performed by an adder that can discard low bits (that is, the first adder described above), which avoids the need for adders of ever higher bit width as the calculation proceeds. As shown in Fig. 5, two 4-bit data items can be added using an adder that discards the lowest bit; this adder can be understood as a 4-bit adder that discards the lowest bit. When it adds two 4-bit data items, the lowest bit of the addition result is discarded; that is, the high three output bits and the carry output of the addition result are retained.

Fig. 5 shows an adder that discards the lowest bit. It should be noted that in this embodiment the first adder is not limited to an adder that discards the lowest bit; it may also be an adder that discards the last one or more bits of the addition result. The lowest-bit-discarding adder of Fig. 5 is shown to aid understanding of the first adder.

When the first adder is used, the addition result can be limited to K bits. For example, when K = 4, only the high 4 bits of the 5-bit result of adding two 4-bit numbers are output. All subsequent additions based on this output are therefore carried out between 4-bit numbers. In this way the first adder can be reused, which saves implementation resources.
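The K = 4 case can be checked exhaustively in a few lines. This is an illustrative software model only; the function name `add_drop_lowest` is hypothetical and the operands are assumed to be unsigned.

```python
N = 4  # operand and adder width, as in the K = 4 example


def add_drop_lowest(a: int, b: int) -> int:
    """Add two 4-bit unsigned values and drop the lowest bit of the 5-bit sum."""
    return (a + b) >> 1


# Try every pair of 4-bit operands.
outputs = [add_drop_lowest(a, b) for a in range(1 << N) for b in range(1 << N)]
largest = max(outputs)

assert largest < (1 << N)  # every result still fits in 4 bits
print(largest)  # 15, from (15 + 15) >> 1
```

Because every output still fits in 4 bits, the next stage can feed two such outputs into the same kind of 4-bit adder, which is what makes reuse of the first adder possible in this model.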

If the data processing method is applied in a neural network, experiments show that the effect of this approximation error on the output of the neural network is acceptable.
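The size of the arithmetic error introduced by discarding low bits can be measured directly for a small case. The sketch below is not the experiment referred to above and says nothing about neural network accuracy; it only compares a two-layer tree of lowest-bit-discarding 4-bit adders against the exact sum. The names `add_drop_lowest` and `tree` are hypothetical, the operands are assumed to be unsigned, and M = N is assumed so that each layer drops exactly one bit.

```python
from itertools import product

N = 4      # operand width in bits
DEPTH = 2  # two computation layers, so four inputs


def add_drop_lowest(a: int, b: int) -> int:
    """Add two unsigned values and drop the lowest bit of the sum."""
    return (a + b) >> 1


def tree(values) -> int:
    """Reduce the inputs pairwise, layer by layer, with the truncating adder."""
    layer = list(values)
    while len(layer) > 1:
        layer = [add_drop_lowest(layer[i], layer[i + 1])
                 for i in range(0, len(layer), 2)]
    return layer[0]


worst = 0
for values in product(range(1 << N), repeat=1 << DEPTH):
    approx = tree(values) << DEPTH  # rescale: each layer halved the value
    error = sum(values) - approx
    assert error >= 0               # truncation never overestimates
    worst = max(worst, error)

print(worst)  # 4
```

For this configuration the rescaled result underestimates the exact sum by at most 4, while the exact sum itself can be as large as 60. Whether an error of this kind is tolerable depends on the application.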

In an optional implementation of this embodiment, performing the addition on the data to be calculated by the first adder and outputting the high M bits of the addition result as the target calculation result includes:

if the bit width of the data to be calculated is greater than or equal to a first preset bit width, performing the addition on the data to be calculated by the first adder, and outputting the high M bits of the addition result as the target calculation result.

In this embodiment, the first preset bit width is preferably 4 bits, but other values may also be chosen. The user can determine the specific value of the first preset bit width according to the amount of available hardware logic resources; it is not specifically limited in this embodiment.

For example, after the data to be calculated is obtained, if its bit width is determined to be equal to 4 bits (that is, the first preset bit width), the first adder can add the data to be calculated and output the high 4 bits of the addition result as the target calculation result. The first adder in this case is a 4-bit adder that discards low bits.

As another example, after the data to be calculated is obtained, if its bit width (for example, 5 bits) is determined to be greater than 4 bits (that is, the first preset bit width), the first adder can likewise add the data to be calculated and output the high 5 bits of the addition result as the target calculation result. The first adder in this case is a 5-bit adder that discards low bits.

As can be seen from the above, in this embodiment a threshold (that is, the first preset bit width) can be set, and this threshold determines whether addition is performed using the method described in steps S202 and S204. The first preset bit width can be set according to actual needs and is not specifically limited in this embodiment.

In an optional implementation of this embodiment, the method also includes:

if the bit width of the data to be calculated is less than or equal to a second preset bit width, performing the addition on the data to be calculated by a second adder, and outputting the addition result.

In this embodiment, the value of the second preset bit width is less than or equal to the first preset bit width; that is, the second preset bit width is determined based on the first preset bit width. When the first preset bit width is 4 bits, the second preset bit width may be 3 bits or 2 bits. The value of the second preset bit width can be determined according to actual needs and is not specifically limited in this embodiment.

For example, assume that the first preset bit width is 4 bits and the second preset bit width is 3 bits. After the data to be calculated is obtained, if its bit width (for example, 3 bits) is determined to be equal to 3 bits (that is, the second preset bit width), the second adder can add the data to be calculated and output the addition result. The second adder in this case is a 3-bit adder.

It should be noted that one benefit of setting the above thresholds (that is, the first preset bit width and the second preset bit width) in this embodiment is that hardware logic resources are saved as far as possible while the data is still calculated normally.
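The choice between the two adders can be illustrated with a short Python sketch. The function name `add_with_thresholds`, its return format, the rule that the first adder takes precedence when both conditions hold, and the choice M = N are assumptions made for illustration. The original does not say what happens when a bit width falls strictly between the two thresholds, so the sketch raises an error in that case.

```python
def add_with_thresholds(a, b, width, first_preset=4, second_preset=3):
    """Add one pair of unsigned operands of the given bit width.

    Returns (result, result_width, adder_used).
    """
    if second_preset > first_preset:
        raise ValueError("second preset must not exceed first preset")
    if width >= first_preset:
        # First adder: add, then discard the low bit (assumes M == N).
        return (a + b) >> 1, width, "first"
    if width <= second_preset:
        # Second adder: ordinary addition, result grows by one bit.
        return a + b, width + 1, "second"
    raise ValueError("bit width between the thresholds is not covered")


small = add_with_thresholds(0b111, 0b101, width=3)
large = add_with_thresholds(0b10110, 0b01101, width=5)
```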

In another optional embodiment,

Step S202, obtaining the data to be calculated, includes step S2021: obtaining the data to be calculated of a first computation layer in an adder tree, where the first computation layer is any computation layer in the adder tree;

Step S204, performing addition on the data to be calculated by the first adder and outputting the high M bits of the addition result, includes step S2041: performing addition and low-bit discarding on the data to be calculated of the first computation layer by the first adder, so as to output the high M bits of the addition result of the first computation layer as the target calculation result of the first computation layer.

As described above, the data processing method provided in this embodiment can be applied to the addition performed in an adder tree, where the adder tree is as shown in Figure 4. As can be seen from Figure 4, an adder tree includes multiple computation layers (for example, Stage1, Stage2 and Stage3).

For example, when any computation layer in the adder tree (that is, the first computation layer) is calculated, the data to be calculated of that layer is obtained. The first adder then performs addition and low-bit discarding on the data to be calculated of the first computation layer, so as to output the high M bits of the addition result of the first computation layer as the target calculation result of the first computation layer.

Steps S2021 and S2041 are described below with reference to the adder tree shown in Fig. 6, in which every adder used is the first adder described above. It should be noted that, as shown in Fig. 6, Stage1, Stage2 and Stage3 are each referred to as a computation layer of the adder tree.

In this embodiment, the data to be calculated of the first computation layer (that is, Stage1) is obtained first; as shown in Fig. 6, it consists of 4 groups of N-bit data. When addition is performed layer by layer according to the tree structure shown in Fig. 6, the first adder (that is, a 4-bit adder that discards the lowest bit) performs addition on the data to be calculated of the first computation layer to obtain first calculation result A. First calculation result A is thus the result, with the low bit discarded, that the first adder computes from the data to be calculated of the first computation layer.

After first calculation result A is obtained, it is used as input data B of the second computation layer (that is, Stage2). When addition is performed on input data B of Stage2, the first adder is used to obtain first calculation result B, which is the result, with the low bit discarded, that the first adder computes from input data B.

After first calculation result B is obtained, it is used as input data C of the third computation layer (that is, Stage3). When addition is performed on input data C of Stage3, the first adder is used to obtain first calculation result C, which is the result, with the low bit discarded, computed from input data C in Stage3. First calculation result C is the final calculation result of the adder tree.
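The three-layer walk-through above can be modeled in Python as follows. This is a sketch under stated assumptions, not the circuit of Fig. 6: the function name `truncating_adder_tree` is hypothetical, the operands are treated as unsigned integers, each first adder is modeled as "add, then shift right by one bit", and the input count is assumed to be a power of two.

```python
def truncating_adder_tree(values):
    """Return the outputs of every computation layer of the tree.

    Each layer pairs up neighbouring values and applies an adder
    that discards the lowest bit of every sum.
    """
    level = list(values)
    count = len(level)
    if count == 0 or count & (count - 1):
        raise ValueError("this sketch expects a power-of-two input count")
    stages = []
    while len(level) > 1:
        level = [(level[i] + level[i + 1]) >> 1
                 for i in range(0, len(level), 2)]
        stages.append(level)
    return stages


# Eight 4-bit inputs give three layers: Stage1, Stage2 and Stage3.
stages = truncating_adder_tree([1, 2, 3, 4, 5, 6, 7, 8])
```

Here `stages` holds results A, B and C in order. The final value approximates the exact sum divided by 8 (36 / 8 = 4.5), since one low bit is dropped at each of the three layers.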

In an optional implementation, step S2041 includes: if the bit width of the data to be calculated of the first computation layer is greater than or equal to the first preset bit width, performing addition and low-bit discarding on the data to be calculated of the first computation layer by the first adder, and outputting the high M bits of the addition result of the first computation layer as the target calculation result of the first computation layer.

In another optional implementation, if the bit width of the data to be calculated of the first computation layer is less than or equal to the second preset bit width, addition is performed on the data to be calculated of the first computation layer by the second adder, and the addition result is output.

It should be noted that, in this embodiment, the first preset bit width and the second preset bit width are selected in advance according to actual needs; the first preset bit width may be chosen as 4 bits or as 5 bits, for example. The following takes a first preset bit width of 4 bits and a second preset bit width of 3 bits as an example and describes this embodiment in detail with reference to Figs. 7 to 9.

As can be seen from Fig. 7, in this case, when the data to be calculated of the first computation layer Stage1 of the adder tree is calculated, its bit width is determined to be 3 bits, which equals the second preset bit width of 3 bits. A 3-bit adder (that is, the second adder) therefore performs addition on the 3-bit data to be calculated of Stage1 to obtain a second calculation result, namely the 4-bit data output by the 3-bit adders in Stage1 of Fig. 7. When the data to be calculated of the second computation layer Stage2 is calculated, its bit width (that is, that of the 4-bit data output by the 3-bit adders in Stage1) is determined to be 4 bits, which equals the first preset bit width of 4 bits. A 4-bit adder (that is, the first adder) therefore performs addition and low-bit discarding on the data to be calculated of Stage2, so as to output the high M bits of the addition result of Stage2 as the target calculation result of Stage2.

As can be seen from Fig. 8, in this case, when the data to be calculated of the first computation layer Stage1 is calculated, its bit width is determined to be 2 bits, which is less than the second preset bit width of 3 bits. A 2-bit adder (that is, the second adder) therefore performs addition on the data to be calculated of Stage1 to obtain one set of addition results, namely the 3-bit data output by the 2-bit adders in Stage1 of Fig. 8. When the second computation layer Stage2 calculates its data to be calculated (that is, the addition results of Stage1), the bit width is determined to be 3 bits, which equals the second preset bit width of 3 bits. A 3-bit adder (that is, the second adder) therefore performs addition on that data to obtain another set of addition results, namely the 4-bit data output by the 3-bit adders in Stage2 of Fig. 8. When the third computation layer Stage3 calculates its data to be calculated (that is, the addition results of Stage2), the bit width is determined to be 4 bits, which equals the first preset bit width of 4 bits. A 4-bit adder (that is, the first adder) therefore performs addition and low-bit discarding on the data to be calculated of Stage3, so as to output the high M bits of the addition result of Stage3 as the target calculation result of Stage3.

As can be seen from Fig. 9, in this case the bit width of the data to be calculated of the first computation layer is 5 bits, which is greater than the first preset bit width of 4 bits. Every computation layer of the adder tree, that is, each of Stage1, Stage2 and Stage3, therefore uses the first adder (that is, a 5-bit adder that discards the low bit) to perform addition on its data to be calculated and obtain the corresponding target calculation result. In this case the first adder may also be called a 5-bit adder that discards the lowest bit. The calculation process is the same as in the embodiment above and is not described again here.
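The three cases of Figs. 7 to 9 differ only in the input bit width, which the following Python sketch makes explicit. It is an illustrative model under assumptions: the function name `mixed_adder_tree` and the textual "plan" it returns are hypothetical, operands are unsigned, the first adder is modeled with M equal to N (one low bit dropped), and the input count is a power of two.

```python
def mixed_adder_tree(values, width, first_preset=4, second_preset=3):
    """Sum values layer by layer, choosing the adder type per layer.

    Returns (final_result, plan), where plan names the adder per layer.
    """
    level = list(values)
    count = len(level)
    if count == 0 or count & (count - 1):
        raise ValueError("this sketch expects a power-of-two input count")
    plan = []
    while len(level) > 1:
        pairs = [(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        if width >= first_preset:
            # First adder: add and discard the low bit; width is unchanged.
            level = [(a + b) >> 1 for a, b in pairs]
            plan.append(f"{width}-bit first adder")
        elif width <= second_preset:
            # Second adder: exact addition; the result is one bit wider.
            level = [a + b for a, b in pairs]
            plan.append(f"{width}-bit second adder")
            width += 1
        else:
            raise ValueError("bit width between the thresholds is not covered")
    return level[0], plan


fig7 = mixed_adder_tree([7] * 8, width=3)
fig8 = mixed_adder_tree([3] * 8, width=2)
fig9 = mixed_adder_tree([31] * 8, width=5)
```

The returned plans reproduce the adder sequence described for each figure: one exact layer followed by two truncating layers for 3-bit inputs, two exact layers followed by one truncating layer for 2-bit inputs, and truncating adders throughout for 5-bit inputs.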

In yet another embodiment, after multiple data to be calculated are obtained, if their number is odd, the tree structure shown in Fig. 10 may be used to perform addition on them layer by layer. As can be seen from Fig. 10, one item of padding data (for example, 000) can be appended at the end of the structure so that the number of data to be calculated becomes even. The calculation process shown in Figs. 6 to 9 can then be used to perform addition on the data layer by layer, and is not described again here.
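The padding step can be sketched in Python as follows. The function name `padded_truncating_tree` is hypothetical, and padding with zero at every layer that has an odd count is an assumption made for illustration; the original only states that padding data such as 000 is appended so that the count becomes even.

```python
def padded_truncating_tree(values):
    """Sum an arbitrary number of unsigned values with truncating adders.

    A zero is appended whenever a layer holds an odd number of values,
    so that every value has a partner for the pairwise addition.
    """
    level = list(values)
    if not level:
        raise ValueError("at least one value is required")
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)           # padding data, e.g. 000
        level = [(level[i] + level[i + 1]) >> 1
                 for i in range(0, len(level), 2)]
    return level[0]


# Three inputs: padded to [4, 6, 8, 0], then reduced to [5, 4], then to 4.
odd_result = padded_truncating_tree([4, 6, 8])
```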

It should be noted that Figs. 6 to 10 are described with 8 data to be calculated as an example. This embodiment does not limit the number of data to be calculated, which may be any number.

As can be seen from Figs. 6 to 10, when the first computation layer is the last computation layer of the adder tree, the target calculation result of the first computation layer is the addition result of the adder tree; when the first computation layer is the first layer of the adder tree, its data to be calculated is the data input to the adder tree.

In yet another optional implementation, if the first computation layer includes multiple groups of data to be calculated, performing addition on the data to be calculated of the first computation layer by the first adder and outputting the high M bits of the addition result of the first computation layer includes:

Determining one first adder, and using that one first adder to perform addition and low-bit discarding on each group of data to be calculated in turn, so as to output the high M bits of each addition result.

As can be seen from Figs. 6 to 10, the first computation layer Stage1 and the second computation layer Stage2 each include multiple adders. In this case there are multiple data to be calculated, and the input data fed to one adder forms one group of data to be calculated. For example, as can be seen from Fig. 7, the second computation layer Stage2 needs two 4-bit adders that discard the lowest bit to perform addition on two groups of data to be calculated. One first adder can be determined here, and that first adder is then reused to perform addition on each group of data to be calculated in turn.

It should be noted that, in the embodiments of the present invention, reusing the first adder reduces the consumption of hardware resources in the FPGA. A comparison of Fig. 6 and Fig. 7 shows that replacing high-bit-width adders with low-bit-width adders (that is, the first adder) already reduces the consumption of hardware resources in the FPGA. Reusing the first adder reduces that consumption further, saving a large amount of logic resources in the processor and thereby alleviating the prior-art technical problem that computing the accumulation operations of a neural network in hardware consumes a large amount of hardware logic resources.

In yet another optional implementation, if the first computation layer includes multiple groups of data to be calculated, performing addition on the data to be calculated of the first computation layer by the first adder and outputting the high M bits of the addition result of the first computation layer further includes:

Determining one first adder for each group of data to be calculated to obtain multiple first adders, and using each first adder to perform addition and low-bit discarding on its corresponding data to be calculated, so as to output the high M bits of the addition result.

As can be seen from Figs. 6 to 10, the first computation layer Stage1 and the second computation layer Stage2 each include multiple adders. In this case there are multiple data to be calculated, and the input data fed to one adder forms one group of data to be calculated. For example, as can be seen from Fig. 7, the second computation layer Stage2 needs two 4-bit adders that discard the low bit to perform addition on two groups of data to be calculated. Two first adders can be determined here, and each then performs addition on its corresponding data to be calculated. That is, one first adder is determined for each group of data to be calculated in the current computation layer.
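The two allocation strategies, one shared first adder versus one first adder per group, can be contrasted with a small Python model. The class `TruncatingAdder` and the two helper functions are hypothetical names introduced for illustration, and counting Python objects is only a rough stand-in for the FPGA logic resources discussed above.

```python
class TruncatingAdder:
    """Software stand-in for an n-bit adder that discards the low bit."""

    def __init__(self, n):
        self.n = n

    def add(self, a, b):
        limit = 1 << self.n
        if not (0 <= a < limit and 0 <= b < limit):
            raise ValueError("operands must fit in n bits")
        return (a + b) >> 1


def run_layer_shared(groups, n):
    """One adder is reused for every group, one group after another."""
    adder = TruncatingAdder(n)
    return [adder.add(a, b) for a, b in groups], 1


def run_layer_per_group(groups, n):
    """A separate adder is determined for each group."""
    adders = [TruncatingAdder(n) for _ in groups]
    results = [adder.add(a, b) for adder, (a, b) in zip(adders, groups)]
    return results, len(adders)


groups = [(9, 5), (12, 15)]          # two groups, as in Stage2 of Fig. 7
shared = run_layer_shared(groups, n=4)
per_group = run_layer_per_group(groups, n=4)
```

Both strategies produce the same results; they differ only in how many adders are needed, which is the trade-off between resource use and parallelism.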

In yet another optional implementation, the method further includes:

When addition is to be performed on the input data of a second computation layer, calling the first adder;

Performing addition and low-bit discarding on the data to be calculated of the second computation layer by the first adder, so as to output the high M bits of the addition result of the second computation layer as the target calculation result of the second computation layer, where the second computation layer is another computation layer located after the first computation layer in the adder tree.

Specifically, take the adder tree shown in Fig. 7, with Stage2 (the second layer of the tree) as the first computation layer and Stage3 (the third layer of the tree) as the second computation layer. As can be seen from Fig. 7, Stage2 needs two 4-bit adders that discard the low bit (that is, two first adders), and Stage3 needs one 4-bit adder that discards the low bit (that is, one first adder). In this case, once the first adders of Stage2 have been determined, a first adder determined for Stage2 can be used to perform addition on the data to be calculated of the second computation layer (Stage3). That is, in this embodiment the first adder can be reused across layers, and cross-layer reuse further saves hardware resources in the FPGA.

As described above, in the embodiments of the present invention the first adder can be reused within one computation layer and also between two computation layers. Without affecting calculation speed, this arrangement further saves hardware resources in the FPGA, saving a large amount of logic resources in the processor and thereby alleviating the prior-art technical problem that computing the accumulation operations of a neural network in hardware consumes a large amount of hardware logic resources.

Based on the process described in the above embodiments, after multiple data to be calculated are obtained and while addition is performed on them layer by layer according to the tree structure of the adder tree, if the bit width of the data to be calculated of the first computation layer in the adder tree is greater than or equal to the first preset bit width, the first adder is determined. The first adder can be determined as follows:

If the first adder has already been built, the built first adder is called;

If the first adder has not been built, the first adder is built.

That is, in this embodiment, if the first adder has not been built in the FPGA, it can be built, and addition and low-bit discarding are then performed on the input data by the newly built first adder. If the first adder has already been built in the FPGA, addition and low-bit discarding of the input data can instead be carried out by reusing that first adder.
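The "call it if built, otherwise build it" rule, together with cross-layer reuse, resembles a simple cache, as the following Python sketch shows. The class name `AdderPool`, its `get` method and the `build_count` attribute are hypothetical and only model the decision logic; they say nothing about how an adder is actually instantiated in an FPGA.

```python
class AdderPool:
    """Keeps at most one truncating adder per bit width."""

    def __init__(self):
        self._built = {}
        self.build_count = 0

    def get(self, n):
        """Return the n-bit first adder, building it only if needed."""
        if n not in self._built:
            self._built[n] = self._build(n)
            self.build_count += 1
        return self._built[n]

    @staticmethod
    def _build(n):
        limit = 1 << n

        def adder(a, b):
            if not (0 <= a < limit and 0 <= b < limit):
                raise ValueError("operands must fit in n bits")
            return (a + b) >> 1       # add, then discard the low bit

        return adder


pool = AdderPool()
stage2_adder = pool.get(4)            # built on first request
stage3_adder = pool.get(4)            # reused by the following layer
```

After both requests the pool has built exactly one adder, which mirrors the case where Stage3 calls the first adder already determined for Stage2.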

In summary, in the embodiments of the present invention, addition is performed on the data to be calculated by the first adder. Compared with a conventional adder, this reduces the consumption of hardware logic resources of the processor, saving a large amount of logic resources in the processor and thereby alleviating the prior-art technical problem that computing the accumulation operations of a neural network in hardware consumes a large amount of hardware logic resources.

The present invention also provides an embodiment of a data processing apparatus, which is mainly used to execute the data processing method provided in the above embodiments of the present invention. The data processing apparatus provided in the embodiments of the present invention is described in detail below.

Figure 11 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in Figure 11, the data processing apparatus mainly includes an acquiring unit 10 and a computing unit 20, where:

The acquiring unit 10 is configured to obtain data to be calculated, where there are multiple data to be calculated, the bit width of each is less than or equal to N, and N is a positive integer greater than zero;

The computing unit 20 is configured to perform addition and low-bit discarding on the data to be calculated by a first adder, so as to output the high M bits of the addition result as the target calculation result, where the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

Optionally, the computing unit 20 is configured to: if the bit width of the data to be calculated is greater than or equal to the first preset bit width, perform addition on the data to be calculated by the first adder and output the high M bits of the addition result as the target calculation result.

Optionally, the apparatus is further configured to: if the bit width of the data to be calculated is less than or equal to the second preset bit width, perform addition on the data to be calculated by a second adder and output the addition result.

Optionally, the acquiring unit 10 is configured to obtain the data to be calculated of a first computation layer in an adder tree, where the first computation layer is any computation layer in the adder tree; and the computing unit 20 is configured to perform addition and low-bit discarding on the data to be calculated of the first computation layer by the first adder, so as to output the high M bits of the addition result of the first computation layer as the target calculation result of the first computation layer.

Optionally, the apparatus is further configured to: call the first adder when addition is to be performed on the input data of a second computation layer; and perform addition and low-bit discarding on the data to be calculated of the second computation layer by the first adder, so as to output the high M bits of the addition result of the second computation layer as the target calculation result of the second computation layer, where the second computation layer is another computation layer located after the first computation layer in the adder tree.

Optionally, the computing unit 20 is further configured to: when the first computation layer includes multiple groups of data to be calculated, determine one first adder and use that one first adder to perform addition and low-bit discarding on each group of data to be calculated, so as to output the high M bits of each addition result; or, when the first computation layer includes multiple groups of data to be calculated, determine one first adder for each group of data to be calculated to obtain multiple first adders, and use each first adder to perform addition and low-bit discarding on its corresponding data to be calculated, so as to output the high M bits of the addition result.

Optionally, the apparatus is further configured to: call the built first adder if the first adder has already been built; and build the first adder if it has not been built.

The implementation principle and technical effects of the apparatus provided in the embodiments of the present invention are the same as those of the foregoing method embodiments. For brevity, where the apparatus embodiment does not mention something, reference may be made to the corresponding content of the foregoing method embodiments.

In another embodiment, an electronic device is further provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method described above when executing the computer program.

In another embodiment, a computer-readable medium storing processor-executable non-volatile program code is further provided, where the program code causes the processor to execute the method described in the foregoing method embodiments.

In another embodiment, a computer program is further provided, which may be stored on a cloud or local storage medium. When run by a computer or processor, the computer program executes the corresponding steps of the data processing method of the embodiments of the present invention and implements the corresponding modules of the data processing apparatus according to the embodiments of the present invention.

In addition, in the description of the embodiments of the present invention, unless otherwise expressly specified and limited, the terms "installed", "connected" and "coupled" are to be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements. Those of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific situation.

In the description of the present invention, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance.

It is apparent to those skilled in the art that for convenience and simplicity of description, the system of foregoing description, The specific work process of device and unit, can refer to corresponding processes in the foregoing method embodiment, and details are not described herein.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or of other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.

Finally, it should be noted that the embodiments described above are only specific implementations of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of their technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.

Claims

1. A data processing method, characterized by comprising:

obtaining data to be calculated, wherein there are multiple data to be calculated, the bit width of each is less than or equal to N, and N is a positive integer greater than zero;

performing addition and low-bit discarding on the data to be calculated by a first adder, so as to output the high M bits of the addition result as a target calculation result, wherein the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.

2. The method according to claim 1, characterized in that performing addition on the data to be calculated by the first adder and outputting the high M bits of the addition result as the target calculation result comprises:

if the bit width of the data to be calculated is greater than or equal to a first preset bit width, performing addition on the data to be calculated by the first adder and outputting the high M bits of the addition result as the target calculation result.

3. The method according to claim 1 or 2, characterized in that the method further comprises:

if the bit width of the data to be calculated is less than or equal to a second preset bit width, performing addition on the data to be calculated by a second adder and outputting the addition result.

4. The method according to any one of claims 1 to 3, characterized in that

obtaining the data to be calculated comprises: obtaining the data to be calculated of a first computation layer in an adder tree, wherein the first computation layer is any computation layer in the adder tree;

performing addition on the data to be calculated by the first adder and outputting the high M bits of the addition result comprises: performing addition and low-bit discarding on the data to be calculated of the first computation layer by the first adder, so as to output the high M bits of the addition result of the first computation layer as the target calculation result of the first computation layer.

5. The method according to claim 4, characterized in that the method further comprises:

performing addition and low-bit discarding on the data to be calculated of a second computation layer by the first adder, so as to output the high M bits of the addition result of the second computation layer as the target calculation result of the second computation layer, wherein the second computation layer is another computation layer located after the first computation layer in the adder tree.

6. The method according to claim 4 or 5, wherein the first computation layer comprises multiple groups of data to be calculated, and performing the addition on the data to be calculated of the first computation layer by the first adder and outputting the high M bits of the addition result of the first computation layer comprises:

determining one first adder, and using the one first adder to perform the addition and the low-bit discarding on each group of data to be calculated separately, so as to output the high M bits of each addition result; or

determining one first adder for each group of data to be calculated, thereby obtaining a plurality of first adders, and using each first adder to perform the addition and the low-bit discarding on its corresponding data to be calculated, so as to output the high M bits of the addition result.
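The layer-by-layer processing in claims 4 to 6 can be illustrated with a small software model of an adder tree. This is a sketch under assumptions rather than the claimed circuit: the names `truncating_add` and `adder_tree` are hypothetical, each group is assumed to hold two operands, each sum is treated as an (N+1)-bit value, and odd-sized layers are padded with zero purely for convenience.

```python
def truncating_add(a, b, n, m):
    """Model of the 'first adder' (assumption): exact sum of two n-bit
    operands, treated as (n + 1) bits, with only the high m bits kept."""
    return (a + b) >> (n + 1 - m)


def adder_tree(values, n, m):
    """Reduce `values` pairwise, discarding low bits in every layer.

    Each pass of the loop models one computation layer; each pair of
    neighbouring values models one group of data to be calculated.
    """
    layer = list(values)
    if not layer:
        raise ValueError("at least one value is required")
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(0)  # padding choice made for this sketch only
        # One output per group; in hardware the groups could share one
        # adder in turn or each have an adder of their own.
        layer = [
            truncating_add(layer[i], layer[i + 1], n, m)
            for i in range(0, len(layer), 2)
        ]
    return layer[0]
```

With N = M = 8, reducing 200, 100, 50 and 30 produces 150 and 40 in the first layer and 95 in the second, which equals the exact sum 380 with its two lowest bits discarded. The two alternatives in claim 6 give the same numerical result in this model; in hardware they would presumably trade logic area (one shared adder) against throughput (one adder per group).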

7. The method according to any one of claims 1 to 6, further comprising:

if the first adder has not been constructed, constructing the first adder.

8. A data processing device, comprising:

an acquiring unit, configured to obtain data to be calculated, wherein there are multiple pieces of data to be calculated, the bit width of each piece of data to be calculated is less than or equal to N, and N is a positive integer; and

a computing unit, configured to perform addition and a low-bit discarding operation on the data to be calculated by a first adder, so as to output the high M bits of the addition result as a target calculation result, wherein the bit width of the addition result is greater than M, the bit width of the first adder is N, and M is a positive integer less than or equal to N.
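The constraints stated for the acquiring unit and the computing unit can be checked in a short software model. This is a hedged sketch only: the function name `add_and_discard_low_bits` is hypothetical, and the width assigned to the addition result (N plus the number of carry bits needed for the given operand count) is an assumption, since the claim only requires that the result be wider than M.

```python
def add_and_discard_low_bits(operands, n, m):
    """Add several unsigned operands and keep only the high m result bits.

    Checks the claim's constraints: multiple operands, each at most n bits
    wide, and 0 < m <= n.
    """
    if len(operands) < 2:
        raise ValueError("multiple operands are required")
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")
    for value in operands:
        if value < 0 or value.bit_length() > n:
            raise ValueError("each operand must fit in n unsigned bits")
    # Assumed result width: enough bits to hold the worst-case exact sum.
    result_width = n + (len(operands) - 1).bit_length()
    total = sum(operands)
    # Low-bit discarding: shift out everything below the high m bits.
    return total >> (result_width - m)
```

With two 8-bit operands the result is treated as 9 bits wide, so M = 8 discards one low bit and M = 4 discards five; with four operands the result is treated as 10 bits wide.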

9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.

10. A computer-readable medium storing non-volatile program code executable by a processor, wherein the program code causes the processor to execute the steps of the method according to any one of claims 1 to 7.