CN106611216A - Computing method and device based on neural network - Google Patents

Computing method and device based on neural network

Info

Publication number
CN106611216A
CN106611216A (application CN201611244816.4A)
Authority
CN
China
Prior art keywords
neural network
training
fixed point
convolution
binarization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611244816.4A
Other languages
Chinese (zh)
Inventor
周舒畅
梁喆
张宇翔
温和
周昕宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Beijing Aperture Science and Technology Ltd
Original Assignee
Beijing Megvii Technology Co Ltd
Beijing Aperture Science and Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Megvii Technology Co Ltd, Beijing Aperture Science and Technology Ltd filed Critical Beijing Megvii Technology Co Ltd
Priority to CN201611244816.4A
Publication of CN106611216A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

An embodiment of the invention provides a computing method based on a neural network. The method comprises the steps of receiving an original image; extracting a feature tensor of the original image; and processing the feature tensor with a trained fixed-point neural network to generate an image heatmap. Because the neural network is implemented with a fixed-point method, the amount of computation is small and few resources are occupied, so the hardware requirements are low.

Description

Computing method and device based on neural network
Technical field
The present invention relates to the field of image processing, and more specifically to a computing method and device based on a neural network.
Background technology
A neural network (NN), also referred to as an artificial neural network (ANN) or connection model, is an algorithmic mathematical model that imitates the behavioral features of animal neural networks and performs distributed parallel information processing. Depending on the complexity of the system, a neural network achieves the purpose of processing information by adjusting the interconnections among a large number of internal nodes.
Neural networks have been applied widely and successfully in many fields such as speech recognition, text recognition, and image/video recognition. Traditional neural networks use double-precision or single-precision floating-point multiply/add operations as the basic computational element. However, this algorithmic framework leads to intensive computation, high memory (or video memory) occupancy, and high bandwidth requirements, so the demands on hardware are high.
Summary of the invention
The present invention is proposed in view of the above problems. The invention provides a computing method based on a neural network that places relatively low requirements on hardware.
According to a first aspect of the invention, there is provided a computing method based on a neural network, comprising:
receiving an original image;
extracting a feature tensor of the original image;
processing the feature tensor with a trained fixed-point neural network to produce an image heatmap.
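The three steps of this first aspect can be sketched as a minimal pipeline. Every name here (`extract_feature_tensor`, `FixedPointNet`, `compute_heatmap`) is a hypothetical placeholder, and the hard-threshold "network" merely stands in for the trained fixed-point network, which the patent leaves abstract:

```python
def extract_feature_tensor(image):
    # Hypothetical: normalize 8-bit pixel values into a flat tensor (list of floats).
    return [p / 255.0 for row in image for p in row]

class FixedPointNet:
    # Placeholder for a trained fixed-point network: one hard-threshold "layer".
    def forward(self, tensor):
        # Produce a heatmap-like binary response per element.
        return [1 if t >= 0.5 else 0 for t in tensor]

def compute_heatmap(image, net):
    tensor = extract_feature_tensor(image)   # S102: extract the feature tensor
    return net.forward(tensor)               # S103: fixed-point processing -> heatmap

image = [[0, 128], [200, 255]]               # S101: the "received" original image
print(compute_heatmap(image, FixedPointNet()))  # [0, 1, 1, 1]
```

The point is only the data flow image → tensor → fixed-point network → heatmap; real feature extraction and network structure are not specified at this level of the description.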
Exemplarily, the trained fixed-point neural network is obtained by the following training method:
perform at least one round of floating-point training on an initial neural network, and convert the parameters of the floating-point-trained network to fixed point;
if the parameters of the fixed-point network do not satisfy a training criterion, replace the initial neural network with the fixed-point network and repeat the above steps;
if the parameters of the fixed-point network satisfy the training criterion, take the fixed-point network as the trained fixed-point neural network.
Exemplarily, the trained fixed-point neural network is a binary neural network, and processing the feature tensor with the trained fixed-point neural network comprises:
performing the following operations at each compute node in the binary neural network:
carrying out a convolution operation on input data;
binarizing the convolution output of the convolution operation.
Exemplarily, binarizing the convolution output of the convolution operation comprises:
mapping the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function;
performing a rounding operation on the mapped number to binarize the intermediate representation.
Exemplarily, performing a rounding operation on the mapped number comprises:
performing the rounding with the formula Y = floor(y + 1) − 0.5,
where y is the mapped number, Y is the binarized intermediate representation, and floor denotes rounding down.
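As an illustrative sketch (not part of the claimed method), the rounding formula can be checked directly. Reading it as Y = floor(y + 1) − 0.5, any mapped value y in [−0.5, 0.5] lands on one of the two values ±0.5:

```python
import math

def binarize(y):
    # Y = floor(y + 1) - 0.5: inputs in [-0.5, 0) -> -0.5, inputs in [0, 0.5] -> +0.5
    return math.floor(y + 1) - 0.5

print([binarize(v) for v in (-0.5, -0.1, 0.0, 0.3, 0.5)])
# [-0.5, -0.5, 0.5, 0.5, 0.5]
```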
Exemplarily, carrying out a convolution operation on input data comprises:
realizing the inner-product operation between a matrix W and a matrix A by the formula M = bitcount(W xor A),
where xor is the bitwise exclusive-or operation, bitcount counts the number of 1s in a binary string and returns it, M is the result of the binary convolution, W is the weight matrix, and A is the activation input.
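Read literally, the formula operates on bit strings; a sketch with Python integers playing the role of binary strings. The final line, recovering a ±1 dot product from the bitcount as n − 2·M, is a standard binary-network identity and an assumption added here for illustration, not a statement from the patent:

```python
def bitcount(x):
    # count the 1s in the binary representation (the "two-value string")
    return bin(x).count("1")

W = 0b10110   # weight bits of one filter row
A = 0b10011   # activation bits
M = bitcount(W ^ A)   # M = bitcount(W xor A)
print(M)              # 2: the bits differ in two positions
# Under a 0/1 <-> -1/+1 encoding (an assumption, common in binary networks),
# the +/-1 inner product over n = 5 bits is n - 2*M:
print(5 - 2 * M)      # 1
```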
Exemplarily, the training criterion is that a test function over a test set is minimized.
Exemplarily, the method is implemented by a system-on-a-chip (SoC) including an FPGA in a camera.
According to a second aspect of the invention, there is provided a computing device based on a neural network, comprising:
a receiving module for receiving an original image;
a feature extraction module for extracting a feature tensor of the original image;
a heatmap generation module for processing the feature tensor with a trained fixed-point neural network to produce an image heatmap.
Exemplarily, the device further comprises a training module, which is configured to:
perform at least one round of floating-point training on an initial neural network, and convert the parameters of the floating-point-trained network to fixed point;
if the parameters of the fixed-point network do not satisfy a training criterion, replace the initial neural network with the fixed-point network and repeat the above steps;
if the parameters of the fixed-point network satisfy the training criterion, take the fixed-point network as the trained fixed-point neural network.
Exemplarily, the trained fixed-point neural network is a binary neural network, and the heatmap generation module comprises a convolution operation submodule and a binarization submodule. For each compute node in the binary neural network:
the convolution operation submodule carries out a convolution operation on input data;
the binarization submodule binarizes the convolution output of the convolution operation.
Exemplarily, the binarization submodule comprises a mapping subunit and a rounding subunit:
the mapping subunit maps the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function;
the rounding subunit performs a rounding operation on the mapped number to binarize the intermediate representation.
Exemplarily, the rounding subunit is specifically configured to perform the rounding with the formula Y = floor(y + 1) − 0.5, where y is the mapped number, Y is the binarized intermediate representation, and floor denotes rounding down.
Exemplarily, the convolution operation submodule is specifically configured to realize the inner-product operation between matrix W and matrix A by the formula M = bitcount(W xor A), where xor is the bitwise exclusive-or operation, bitcount counts the number of 1s in a binary string and returns it, M is the result of the binary convolution, W is the weight matrix, and A is the activation input.
Exemplarily, the training criterion is that a test function over a test set is minimized.
Exemplarily, the device is an SoC including an FPGA in a camera.
The device of the second aspect can be used to implement the neural-network-based computing method of the first aspect.
According to a third aspect of the invention, there is provided a computer chip comprising a processor and a memory. The memory stores instruction code, the processor is configured to execute the instruction code, and when the processor executes the instruction code, the neural-network-based computing method of the first aspect can be realized.
By implementing the computation of the neural network with a fixed-point method, the embodiments of the invention keep the amount of computation small and the resource occupancy low, so the hardware requirements are relatively low and the method can run on an FPGA.
Brief description of the drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description of embodiments of the invention taken in conjunction with the accompanying drawings. The drawings provide a further understanding of the embodiments of the invention, form a part of the specification, and together with the embodiments serve to explain the invention without limiting it. In the drawings, identical reference numbers generally denote identical components or steps.
Fig. 1 is a schematic block diagram of an electronic device according to an embodiment of the invention;
Fig. 2 is a schematic flowchart of a neural-network-based computing method according to an embodiment of the invention;
Fig. 3 is a schematic flowchart of a method for obtaining a trained neural network according to an embodiment of the invention;
Fig. 4 is a schematic diagram of the connections between an FPGA and other devices according to an embodiment of the invention;
Fig. 5 is a schematic block diagram of a neural-network-based computing device according to an embodiment of the invention;
Fig. 6 is another schematic block diagram of a neural-network-based computing device according to an embodiment of the invention.
Detailed description of embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, example embodiments of the invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention, and it should be understood that the invention is not limited by the example embodiments described here. Based on the embodiments of the invention described herein, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the invention.
The general algorithmic framework of a traditional neural network is as follows: (1) the input image is extracted into the form of a tensor and passed to a trained floating-point neural network; (2) each compute node in the floating-point network performs a convolution operation built from floating-point multiply/add operations and sends the floating-point result to the compute nodes of the next layer; (3) after the computations of multiple layers of nodes, the output layer of the network produces an image segmentation heatmap, on the basis of which the original image is then annotated. However, such a framework often leads to intensive computation, high memory (or video memory) occupancy, and high bandwidth requirements, so the demands on hardware are high. Because the compute, storage, and bandwidth of a field-programmable gate array (FPGA) platform are all very limited, a traditional neural network is difficult to run on an FPGA platform.
An embodiment of the invention proposes a method that implements the computation of a neural network with a fixed-point method. Because fixed-point computation is used, the amount of computation is small and few resources are occupied, so the hardware requirements are relatively low, and the method can run on a system-on-a-chip (SoC) including an FPGA in a camera.
Embodiments of the invention can be applied to an electronic device; Fig. 1 shows a schematic block diagram of an electronic device according to an embodiment of the invention. The electronic device 10 shown in Fig. 1 comprises one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, an image sensor 110, and one or more non-image sensors 114, which are interconnected through a bus system 112 and/or in other ways. It should be noted that the components and structure of the electronic device 10 shown in Fig. 1 are merely exemplary and non-restrictive; the electronic device may have other components and structures as needed.
The processor 102 may comprise a central processing unit (CPU) 1021 and/or a graphics processing unit (GPU) 1022, or other forms of processing units with data-processing capability and/or instruction-execution capability, and may control other components in the electronic device 10 to perform desired functions.
The storage device 104 may comprise one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory 1041 and/or non-volatile memory 1042. The volatile memory 1041 may, for example, include random access memory (RAM) and/or cache memory. The non-volatile memory 1042 may, for example, include read-only memory (ROM), a hard disk, flash memory, and so on. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to realize various desired functions. Various applications and various data, such as the data used and/or produced by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device through which a user inputs instructions, and may comprise one or more of a keyboard, a mouse, a microphone, a touch screen, and so on.
The output device 108 may output various information (such as images or sounds) to the outside (for example, to a user), and may comprise one or more of a display, a loudspeaker, and so on.
The image sensor 110 may capture images desired by the user (such as photos or videos) and store the captured images in the storage device 104 for use by other components.
It should be noted that the components and structure of the electronic device 10 shown in Fig. 1 are exemplary. Although the electronic device shown in Fig. 1 comprises multiple different devices, some of them may be unnecessary and the quantity of some of them may be larger as needed; the invention does not limit this.
Fig. 2 is a schematic flowchart of a neural-network-based computing method according to an embodiment of the invention. The method shown in Fig. 2 comprises:
S101: receive an original image.
Specifically, an original image collected by a camera can be received. Exemplarily, the original image may also be called the input image.
S102: extract the feature tensor of the original image.
Specifically, the original image can be extracted into the form of a tensor, thereby obtaining the feature tensor of the original image.
S103: process the feature tensor with the trained fixed-point neural network to produce an image heatmap.
Exemplarily, in S103 each compute node in the trained fixed-point neural network performs: (a) a convolution operation on input data; (b) fixed-point conversion of the convolution output of the convolution operation. The convolution operation in (a) may be performed once or multiple times; the invention does not limit this.
Exemplarily, the trained fixed-point neural network may be a binary neural network.
In that case, in S103 each compute node in the binary neural network performs the following operations: (a1) a convolution operation on input data; (b1) binarization of the convolution output of the convolution operation. The convolution operation in (a1) may be performed once or multiple times; the invention does not limit this.
Because the convolution can be regarded as multiplication by the weight matrix W, the matrix inner product, the most time-consuming part of matrix multiplication, can be carried out with bit operations using the bitcount function. That is, (a1) can include realizing the inner-product operation between matrix W and matrix A by the formula M = bitcount(W xor A).
Exemplarily, (a1) can include a binary convolution operation, which realizes the inner-product operation between matrix W and matrix A by the formula M = bitcount(W xor A).
Here the bitcount function counts the number of 1s in a binary string and returns it, and xor denotes the exclusive-or operation, i.e., bitwise exclusive-or. M is the result of the binary convolution, W is the weight matrix, and A is the activation input. It can be seen that the formula M = bitcount(W xor A) is equivalent to the inner product of W and A.
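The claimed equivalence between M = bitcount(W xor A) and the inner product can be checked numerically. Under the ±0.5 value encoding used elsewhere in this description (bit 0 ↔ −0.5, bit 1 ↔ +0.5 — an interpretation assumed here, not stated explicitly), the inner product is an affine function of M, so computing M determines the inner product:

```python
def check(bits_w, bits_a):
    # M = bitcount(W xor A): number of positions where the bits differ
    n = len(bits_w)
    M = sum(w != a for w, a in zip(bits_w, bits_a))
    # decode bits to +/-0.5 values and take the ordinary inner product
    vals = lambda bits: [0.5 if b else -0.5 for b in bits]
    dot = sum(w * a for w, a in zip(vals(bits_w), vals(bits_a)))
    assert dot == 0.25 * (n - 2 * M)   # inner product is recovered from M alone
    return M, dot

print(check([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))  # (2, 0.25)
```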
Exemplarily, (b1) can include: mapping the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function; and performing a rounding operation on the mapped number to binarize the intermediate representation.
The nonlinear function can be y = tanh(x) × 0.5, where x is the convolution output and y is the mapped number. It is understood that the nonlinear function can also take other functional forms; the invention does not limit this.
Exemplarily, the rounding operation on the mapped number can be performed with the formula Y = floor(y + 1) − 0.5, where y is the mapped number, Y is the binarized intermediate representation, and floor denotes the round-down function: floor(y) is the largest integer not greater than y (floor is sometimes also written Floor).
It should be understood that in the embodiments of the invention the lowercase x and y are generally floating-point numbers, while the uppercase Y is a fixed-point number taking only one of two values.
As can be seen, operation (a1) can realize the inner-product operation between matrix W and matrix A by the formula M = bitcount(W xor A), and operation (b1) can map the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function and perform a rounding operation on the mapped number.
In a binary neural network, the parameters take only one of two values, and the intermediate representations likewise take only one of two values. The binary network is an approximation of the original network, and all of its operations can be carried out quickly on a computer, thereby reducing the memory space occupied and the demand for computing resources.
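One compute node — binary convolution followed by the tanh mapping and the floor rounding — can be sketched end to end. The 0/1 ↔ ±0.5 bit encoding and the recovery of the convolution output from the bitcount are assumptions made for illustration; the patent leaves these details abstract:

```python
import math

def node(bits_w, bits_a):
    n = len(bits_w)
    M = sum(w != a for w, a in zip(bits_w, bits_a))  # (a1) M = bitcount(W xor A)
    x = 0.25 * (n - 2 * M)          # convolution output recovered from M (assumed encoding)
    y = math.tanh(x) * 0.5          # (b1) map into [-0.5, 0.5]
    return math.floor(y + 1) - 0.5  # (b1) Y = floor(y + 1) - 0.5

print(node([1, 1, 0, 1], [1, 0, 0, 1]))  # 0.5
```

The node's output is again one of two values, so it can feed the next binary layer directly.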
Exemplarily, the trained fixed-point neural network used in S103 can be obtained by the following training method:
perform at least one round of floating-point training on an initial neural network, and convert the parameters of the floating-point-trained network to fixed point;
if the parameters of the fixed-point network do not satisfy a training criterion, replace the initial neural network with the fixed-point network and repeat the above steps;
if the parameters of the fixed-point network satisfy the training criterion, take the fixed-point network as the trained fixed-point neural network.
In the embodiments of the invention, the parameters of the neural network can include the weights W, the fixed-point decision threshold, the offset of the convolution result, and other parameters, determined according to the actual situation; they are not restricted here.
Specifically, the process can be as shown in Fig. 3, comprising:
S201: perform at least one round of floating-point training on the initial neural network.
S202: convert the parameters of the floating-point-trained network to fixed point.
S203: judge whether the parameters of the fixed-point network satisfy the training criterion.
If the judgment of S203 is yes, perform S205; if the judgment of S203 is no, perform S204.
S204: replace the initial neural network with the fixed-point network.
Specifically, in S204 the fixed-point network obtained in S202 is taken as the initial neural network of S201, and S201 and S202 are then repeated.
S205: take the fixed-point network as the trained fixed-point neural network.
Thus, through the training process shown in Fig. 3, the trained fixed-point neural network can be obtained.
Exemplarily, the training criterion here is that a test function over a test set is minimized.
In the embodiments of the invention, the floating-point training process is similar to the training process of a general neural network. Specifically, a training set can be constructed on the basis of the test set, and, starting from a loss function, the neural network parameters that minimize the test function over the test set can be determined by the steepest-descent method, the conjugate gradient method, or the like.
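The loop S201–S205 can be sketched as follows. The training, quantization, and criterion functions passed in are hypothetical toy stand-ins (a single scalar "parameter" pulled toward 1.0 and quantized to quarters); the patent leaves them abstract:

```python
def train_fixed_point(params, train_floating, quantize, meets_criterion, max_iters=100):
    # Alternate floating-point training and fixed-point conversion (Fig. 3, S201-S205).
    for _ in range(max_iters):
        params = train_floating(params)   # S201: at least one round of floating-point training
        fixed = quantize(params)          # S202: convert the parameters to fixed point
        if meets_criterion(fixed):        # S203: does the fixed-point network meet the criterion?
            return fixed                  # S205: use it as the trained fixed-point network
        params = fixed                    # S204: restart from the fixed-point network
    return fixed

result = train_fixed_point(
    params=0.0,
    train_floating=lambda p: p + 0.3 * (1.0 - p),   # toy "training" step
    quantize=lambda p: round(p * 4) / 4,            # toy fixed-point grid (quarters)
    meets_criterion=lambda p: abs(p - 1.0) <= 0.25, # toy training criterion
)
print(result)  # 0.75
```

The key structural point is that each new round of floating-point training starts from the quantized parameters of the previous round, not from the unquantized ones.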
As one embodiment, if the trained fixed-point neural network is a binary neural network, the trained binary neural network can be obtained by the following training method:
perform at least one round of floating-point training on an initial neural network, and binarize the parameters of the floating-point-trained network;
if the parameters of the binarized network do not satisfy a training criterion, replace the initial neural network with the binarized network and repeat the above steps;
if the parameters of the binarized network satisfy the training criterion, take the binarized network as the trained binary neural network.
That is, the training process of the binary neural network can include: after one round of training, binarize the trained parameters, and then carry out the next round of training, where one round of training includes at least one round of floating-point training. In this way, the intermediate results of the training process are binarized, which reduces the memory space they occupy.
Exemplarily, one-hot encoding can be used for the ground-truth values during training. For example, if the training task is 5-way classification and a certain pixel belongs to the 4th class, the classification of that pixel is represented on the five channels as 00010.
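A sketch of the one-hot ground-truth encoding described above (5-class task, pixel in the 4th class); the helper name and 1-based class numbering are illustrative choices:

```python
def one_hot(cls, num_classes):
    # channel c is 1 only for the pixel's class; classes counted from 1 here
    return [1 if c == cls else 0 for c in range(1, num_classes + 1)]

print("".join(map(str, one_hot(4, 5))))  # 00010
```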
Furthermore, a target heatmap with K values can be split into K binary heatmaps, where the target heatmap with K values is a heatmap annotated manually in advance, and the K manually annotated values can be floating-point numbers.
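The split of a K-valued target heatmap into K binary heatmaps can be sketched as one indicator map per annotated value. The exact split rule is not given in the patent; an indicator map per distinct value is assumed here:

```python
def split_heatmap(target, values):
    # one binary map per annotated value: 1 where the target equals that value
    return {v: [[1 if cell == v else 0 for cell in row] for row in target]
            for v in values}

target = [[0.0, 1.5], [1.5, 3.0]]       # K = 3 annotated (possibly floating-point) values
maps = split_heatmap(target, [0.0, 1.5, 3.0])
print(maps[1.5])  # [[0, 1], [1, 0]]
```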
Exemplarily, the output of the last convolutional layer can be mapped into the closed interval [-0.5, 0.5] using a nonlinear function, and a rounding operation can be performed on the mapped number to binarize the intermediate representation.
The nonlinear function can be y = tanh(x) × 0.5, where x is the output of the last convolutional layer and y is the mapped number. It is understood that the nonlinear function can also take other functional forms; the invention does not limit this.
The rounding operation on the mapped number can be performed with the formula Y = floor(y + 1) − 0.5, where y is the mapped number, Y is the binarized intermediate representation, and floor denotes rounding down.
Because the convolution can be regarded as multiplication by the matrix W, the matrix inner product, the most time-consuming part of matrix multiplication, can be carried out with bit operations using the bitcount function. Exemplarily, the inner-product operation between matrix W and matrix A can be realized by the formula M = bitcount(W xor A), where xor is the bitwise exclusive-or operation, bitcount counts the number of 1s in a binary string and returns it, M is the result of the binary convolution, W is the weight matrix, and A is the activation input.
It is understood that the method for the training shown in Fig. 3 can be performed before S101.Further optionally, S103 it Afterwards, can also include:The original image is labeled using described image thermodynamic chart.For example, can be to the original Beginning image carries out two-value mark.For example, the image thermodynamic chart that can be obtained according to S103, the live body region in original image is shown 1 is shown as, other non-live body regions are shown as 0.
Compared to 32 most time-consuming before floating-point convolution operations, the method for the embodiment of the present invention is only in two-value intermediate representation And two carried out between value parameter, in theory for, the method for the embodiment of the present invention can obtain the lifting of 32 times of arithmetic speeds, and energy Save the memory space of 32 times of intermediate representation.
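The 32-fold storage figure follows from packing 32 binary intermediate values into the space of one 32-bit float; a sketch of such packing (the packing layout is an illustration, not specified by the patent):

```python
def pack_bits(bits):
    # pack a list of 0/1 values into one integer word, bit i <- bits[i]
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word

bits = [1, 0, 1, 1] + [0] * 28              # 32 binary intermediate values
word = pack_bits(bits)
print(word)                                  # 13: one 32-bit word instead of 32 floats
print([(word >> i) & 1 for i in range(4)])   # [1, 0, 1, 1] recovered
```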
Exemplarily, the method for the embodiment of the present invention can be by the on-chip system including FPGA in camera (System-on-a-chip, SoC) is realized.Specifically, the employing fixed point method of the embodiment of the present invention realizes neutral net The method of calculating, calculates as a result of two-value, and amount of calculation is little, and occupancy resource is few, so as to the requirement to hardware is relatively low, therefore can To run on FPGA.The information that camera is gathered can be directly inputted to FPGA rather than CPU, and binary neural network is run On the FPGA, interacted with CPU by FPGA, can so lift image processing speed.With directly with CPU process image phase Than performance has the lifting of several times.
As shown in figure 4, FPGA can obtain input picture from camera, and based on binary neural network to the input picture Processed.Afterwards, FPGA can be exported result, or result can be stored in into Double Data Rate synchronization In dynamic RAM (Double Data Rate, DDR), or can be communicated with CPU.
In the embodiment of the present invention, due to intermediate result being realized into binaryzation, therefore less memory space is taken, can To allow FPGA and the high-speed communication between DDR or FPGA and CPU.
Fig. 5 is a schematic block diagram of a neural-network-based computing device according to an embodiment of the invention. The device 50 shown in Fig. 5 comprises a receiving module 501, a feature extraction module 502, and a heatmap generation module 503.
The receiving module 501 is configured to receive an original image;
the feature extraction module 502 is configured to extract the feature tensor of the original image;
the heatmap generation module 503 is configured to process the feature tensor with the trained fixed-point neural network to produce an image heatmap.
Exemplarily, as shown in Fig. 6, the device further comprises a training module 504, which is configured to:
perform at least one round of floating-point training on an initial neural network, and convert the parameters of the floating-point-trained network to fixed point;
if the parameters of the fixed-point network do not satisfy a training criterion, replace the initial neural network with the fixed-point network and repeat the above steps;
if the parameters of the fixed-point network satisfy the training criterion, take the fixed-point network as the trained fixed-point neural network.
Exemplarily, the trained fixed-point neural network is a binary neural network. As shown in Fig. 6, the heatmap generation module 503 comprises a convolution operation submodule 5031 and a binarization submodule 5032. For each compute node in the binary neural network:
the convolution operation submodule 5031 carries out a convolution operation on input data;
the binarization submodule 5032 binarizes the convolution output of the convolution operation.
Exemplarily, the binaryzation submodule 5032 includes mapping subelement and rounding-off subelement.Mapping is single Unit, for the number being mapped as convolution output using nonlinear function within [- 0.5,0.5] closed interval.Rounding-off Unit, for carrying out rounding-off operation to the number after the mapping, to realize the binaryzation of intermediate representation.
Exemplarily, the rounding-off subelement, specifically for:Using -0.5 pair of mapping of formula Y=floor (y+1) Number afterwards carries out rounding-off operation.Wherein, y represents the number after the mapping, and Y represents the intermediate representation after binaryzation, and floor is represented Round downwards.
Exemplarily, the convolution sub-module 5031 is specifically configured to realize the inner product between matrix W and matrix A by the formula M = bitcount(W xor A), where xor is a bitwise exclusive-or operation, bitcount counts and returns the number of 1s in a binary string, M is the binary convolution result, W is the weight matrix, and A is the activation input.
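The XOR/bitcount inner product can be illustrated on bit-packed integer rows. This is a minimal software sketch, not the patent's FPGA implementation; the note on recovering a ±1 inner product is a common convention for binary networks and is not stated in the source:

```python
def binary_inner(w_bits: int, a_bits: int) -> int:
    """M = bitcount(W xor A): the number of bit positions where the
    weight row and the activation row differ. Under the usual encoding
    bit 0 -> +1 and bit 1 -> -1, an n-bit row's +1/-1 inner product
    equals n - 2 * M."""
    return bin(w_bits ^ a_bits).count("1")
```

For example, `binary_inner(0b1011, 0b1001)` returns 1, since the two rows differ in exactly one bit position.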
Exemplarily, the training criterion is that a test function on a test set is minimized.
Exemplarily, the device is a system-on-chip (SoC) including an FPGA in a camera.
The device 50 shown in Fig. 5 and Fig. 6 may implement the neural-network-based computing methods shown in Fig. 2 and Fig. 3 above.
In addition, an embodiment of the present invention further provides another neural-network-based computing device, which may include a processor and a memory, where the memory is used to store instruction code; when the processor executes the instruction code, the neural-network-based computing methods shown in Fig. 2 and Fig. 3 above can be realized.
In addition, an embodiment of the present invention further provides an electronic apparatus, which may include the device 50 shown in Fig. 5 or Fig. 6.
The method of the embodiments of the present invention realizes neural network computation by a fixed-point approach. Because fixed-point computation is used, the amount of computation is small and few resources are occupied, so the hardware requirements are low and the computation can run on an FPGA.
Although the example embodiments have been described here with reference to the accompanying drawings, it should be understood that the above example embodiments are merely exemplary and are not intended to limit the scope of the present invention thereto. Those of ordinary skill in the art may make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as defined by the appended claims.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to realize the described functions for each specific application, but such realization should not be considered as going beyond the scope of the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be realized in other ways. For example, the apparatus embodiments described above are merely schematic; the division of the units is only a division by logical function, and other divisions are possible in actual realization; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not performed.
In the specification provided herein, numerous specific details are set forth. It should be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, to streamline the present disclosure and aid the understanding of one or more of the various inventive aspects, in the description of exemplary embodiments of the present invention, various features of the present invention are sometimes grouped together into a single embodiment, figure, or description thereof. The disclosed method, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the inventive point lies in that the corresponding technical problem can be solved with fewer than all the features of a single disclosed embodiment. Thus, the claims following a specific embodiment are hereby expressly incorporated into that specific embodiment, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will understand that, except where features are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include some features included in other embodiments rather than others, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be realized in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to realize some or all of the functions of some of the modules in the article analysis device according to embodiments of the present invention. The present invention may also be implemented as programs of devices (for example, computer programs and computer program products) for performing part or all of the methods described herein. Such programs realizing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be realized by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any ordering; these words may be interpreted as names.
The above is only a description of specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any person familiar with the technical field can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and all such changes or substitutions shall be covered within the protection scope of the present invention. The protection scope of the present invention shall be defined by the protection scope of the claims.

Claims (16)

1. A computing method based on a neural network, characterized by comprising:
receiving an original image;
extracting a feature tensor of the original image;
processing the feature tensor based on a trained fixed-point neural network to produce an image heat map.
2. The method of claim 1, characterized in that the trained fixed-point neural network is obtained by training as follows:
performing floating-point training on an initial neural network at least once, and converting the parameters of the neural network after the floating-point training to fixed point;
if the parameters of the neural network after the fixed-point conversion do not satisfy a training criterion, replacing the initial neural network with the neural network after the fixed-point conversion and repeating the above steps;
if the parameters of the neural network after the fixed-point conversion satisfy the training criterion, taking the neural network after the fixed-point conversion as the trained fixed-point neural network.
3. The method of claim 1 or 2, characterized in that the trained fixed-point neural network is a binary neural network, and processing the feature tensor based on the trained fixed-point neural network comprises:
performing, at each compute node in the binary neural network, the following operations:
performing a convolution operation on input data;
binarizing the convolution output of the convolution operation.
4. The method of claim 3, characterized in that binarizing the convolution output of the convolution operation comprises:
mapping the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function;
rounding the mapped number to realize binarization of an intermediate representation.
5. The method of claim 4, characterized in that rounding the mapped number comprises:
rounding the mapped number using the formula Y = floor(y + 1) - 0.5,
wherein y denotes the mapped number, Y denotes the binarized intermediate representation, and floor denotes rounding down.
6. The method of claim 3, characterized in that performing a convolution operation on input data comprises:
realizing the inner product between matrix W and matrix A by the formula M = bitcount(W xor A),
wherein xor is a bitwise exclusive-or operation, bitcount counts and returns the number of 1s in a binary string, M is the binary convolution result, W is the weight matrix, and A is the activation input.
7. The method of claim 2, characterized in that the training criterion is that a test function on a test set is minimized.
8. The method of any one of claims 1 to 7, characterized in that the method is realized by a system-on-chip (SoC) including a field-programmable gate array (FPGA) in a camera.
9. A computing device based on a neural network, characterized by comprising:
a receiving module, configured to receive an original image;
a feature extraction module, configured to extract a feature tensor of the original image;
a heat map generation module, configured to process the feature tensor based on a trained fixed-point neural network to produce an image heat map.
10. The device of claim 9, characterized by further comprising a training module, configured to:
perform floating-point training on an initial neural network at least once, and convert the parameters of the neural network after the floating-point training to fixed point;
if the parameters of the neural network after the fixed-point conversion do not satisfy a training criterion, replace the initial neural network with the neural network after the fixed-point conversion and repeat the above steps;
if the parameters of the neural network after the fixed-point conversion satisfy the training criterion, take the neural network after the fixed-point conversion as the trained fixed-point neural network.
11. The device of claim 9 or 10, characterized in that the trained fixed-point neural network is a binary neural network, and the heat map generation module comprises a convolution sub-module and a binarization sub-module, wherein,
for each compute node in the binary neural network:
the convolution sub-module is configured to perform a convolution operation on input data;
the binarization sub-module is configured to binarize the convolution output of the convolution operation.
12. The device of claim 11, characterized in that the binarization sub-module comprises a mapping sub-unit and a rounding sub-unit:
the mapping sub-unit is configured to map the convolution output into the closed interval [-0.5, 0.5] using a nonlinear function;
the rounding sub-unit is configured to round the mapped number to realize binarization of an intermediate representation.
13. The device of claim 12, characterized in that the rounding sub-unit is specifically configured to:
round the mapped number using the formula Y = floor(y + 1) - 0.5,
wherein y denotes the mapped number, Y denotes the binarized intermediate representation, and floor denotes rounding down.
14. The device of claim 11, characterized in that the convolution sub-module is specifically configured to:
realize the inner product between matrix W and matrix A by the formula M = bitcount(W xor A),
wherein xor is a bitwise exclusive-or operation, bitcount counts and returns the number of 1s in a binary string, M is the binary convolution result, W is the weight matrix, and A is the activation input.
15. The device of claim 10, characterized in that the training criterion is that a test function on a test set is minimized.
16. The device of any one of claims 9 to 15, characterized in that the device is a system-on-chip (SoC) including a field-programmable gate array (FPGA) in a camera.
CN201611244816.4A 2016-12-29 2016-12-29 Computing method and device based on neural network Pending CN106611216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611244816.4A CN106611216A (en) 2016-12-29 2016-12-29 Computing method and device based on neural network


Publications (1)

Publication Number Publication Date
CN106611216A true CN106611216A (en) 2017-05-03

Family

ID=58636214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611244816.4A Pending CN106611216A (en) 2016-12-29 2016-12-29 Computing method and device based on neural network

Country Status (1)

Country Link
CN (1) CN106611216A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005024625A1 (en) * 2003-08-28 2005-03-17 Hitachi Ulsi Systems Co., Ltd. Data processor
CN105654176A (en) * 2014-11-14 2016-06-08 富士通株式会社 Nerve network system, and training device and training method for training nerve network system
CN105760933A (en) * 2016-02-18 2016-07-13 清华大学 Method and apparatus for fixed-pointing layer-wise variable precision in convolutional neural network
CN106251338A (en) * 2016-07-20 2016-12-21 北京旷视科技有限公司 Target integrity detection method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sajid Anwar, Kyuyeon Hwang, Wonyong Sung, "Fixed Point Optimization of Deep Convolutional Neural Networks for Object Recognition", 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108496188A (en) * 2017-05-31 2018-09-04 深圳市大疆创新科技有限公司 Method, apparatus, computer system and the movable equipment of neural metwork training
CN107368857A (en) * 2017-07-24 2017-11-21 深圳市图芯智能科技有限公司 Image object detection method, system and model treatment method, equipment, terminal
CN108875482A (en) * 2017-09-14 2018-11-23 北京旷视科技有限公司 Object detecting method and device, neural network training method and device
CN108876790A (en) * 2017-09-14 2018-11-23 北京旷视科技有限公司 Image, semantic dividing method and device, neural network training method and device
CN108875482B (en) * 2017-09-14 2022-01-21 北京旷视科技有限公司 Object detection method and device and neural network training method and device
CN107657312A (en) * 2017-09-18 2018-02-02 东南大学 Towards the two-value real-time performance system of voice everyday words identification
CN111164604A (en) * 2017-09-26 2020-05-15 株式会社爱考斯研究 Information processing apparatus
CN111164604B (en) * 2017-09-26 2024-03-22 株式会社爱信 Information processing apparatus
CN109754066A (en) * 2017-11-02 2019-05-14 三星电子株式会社 Method and apparatus for generating fixed-point type neural network
CN107766939A (en) * 2017-11-07 2018-03-06 维沃移动通信有限公司 A kind of data processing method, device and mobile terminal
CN109961137A (en) * 2017-12-14 2019-07-02 北京中科寒武纪科技有限公司 Integrated circuit chip device and Related product
CN108154229A (en) * 2018-01-10 2018-06-12 西安电子科技大学 Accelerate the image processing method of convolutional neural networks frame based on FPGA
CN108154229B (en) * 2018-01-10 2022-04-08 西安电子科技大学 Image processing method based on FPGA (field programmable Gate array) accelerated convolutional neural network framework
CN108334945B (en) * 2018-01-30 2020-12-25 中国科学院自动化研究所 Acceleration and compression method and device of deep neural network
CN108334945A (en) * 2018-01-30 2018-07-27 中国科学院自动化研究所 The acceleration of deep neural network and compression method and device
CN108416426A (en) * 2018-02-05 2018-08-17 深圳市易成自动驾驶技术有限公司 Data processing method, device and computer readable storage medium
CN108875924A (en) * 2018-02-09 2018-11-23 北京旷视科技有限公司 Data processing method, device, system and storage medium neural network based
CN108334946A (en) * 2018-02-13 2018-07-27 北京旷视科技有限公司 Deep neural network and its processing method, device and equipment
CN109359520A (en) * 2018-09-04 2019-02-19 汇纳科技股份有限公司 People counting method, system, computer readable storage medium and server
CN109359520B (en) * 2018-09-04 2021-12-17 汇纳科技股份有限公司 Crowd counting method, system, computer readable storage medium and server
CN109308517A (en) * 2018-09-07 2019-02-05 中国科学院计算技术研究所 Binaryzation device, method and application towards binary neural network
CN112840353B (en) * 2018-11-01 2023-12-29 赫尔实验室有限公司 System, method and medium for automatically generating images and inputting images in training
CN112840353A (en) * 2018-11-01 2021-05-25 赫尔实验室有限公司 Automatic generation of images satisfying attributes of a specified neural network classifier
WO2020098368A1 (en) * 2018-11-15 2020-05-22 北京嘉楠捷思信息技术有限公司 Adaptive quantization method and apparatus, device and medium
CN109543835B (en) * 2018-11-30 2021-06-25 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109558943B (en) * 2018-11-30 2021-05-04 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109558943A (en) * 2018-11-30 2019-04-02 上海寒武纪信息科技有限公司 Operation method, device and Related product
CN109543835A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and Related product
CN109543836A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and Related product
WO2020135601A1 (en) * 2018-12-29 2020-07-02 北京市商汤科技开发有限公司 Image processing method and device, vehicle-mounted operation platform, electronic device and system
JP2022512211A (en) * 2018-12-29 2022-02-02 ベイジン センスタイム テクノロジー デベロップメント シーオー.,エルティーディー Image processing methods, equipment, in-vehicle computing platforms, electronic devices and systems
CN110210462A (en) * 2019-07-02 2019-09-06 北京工业大学 A kind of bionical hippocampus cognitive map construction method based on convolutional neural networks
GB2586642A (en) * 2019-08-30 2021-03-03 Advanced Risc Mach Ltd Data processing
GB2586642B (en) * 2019-08-30 2022-03-30 Advanced Risc Mach Ltd Data processing
WO2021143686A1 (en) * 2020-01-14 2021-07-22 杭州海康威视数字技术股份有限公司 Neural network fixed point methods and apparatuses, electronic device, and readable storage medium
CN111387938A (en) * 2020-02-04 2020-07-10 华东理工大学 Patient heart failure death risk prediction system based on feature rearrangement one-dimensional convolutional neural network
CN111387938B (en) * 2020-02-04 2023-06-23 华东理工大学 Patient heart failure death risk prediction system based on characteristic rearrangement one-dimensional convolutional neural network

Similar Documents

Publication Publication Date Title
CN106611216A (en) Computing method and device based on neural network
CN107767408B (en) Image processing method, processing device and processing equipment
CN107506828B (en) Artificial neural network computing device and method for sparse connection
CN106855952A (en) Computing method and device based on neural network
CN106796716B (en) For providing the device and method of super-resolution for low-resolution image
JP2019535079A (en) Efficient data layout for convolutional neural networks
CN109685198A (en) Method and apparatus for quantifying the parameter of neural network
CN108875486A (en) Object recognition method, apparatus, system and computer-readable medium
CN106779057B (en) Method and device for calculating binary neural network convolution based on GPU
CN106228238A (en) Method and system for accelerating deep learning algorithms on a field-programmable gate array platform
CN106529511A (en) Image structuring method and device
CN109416758A (en) Neural network and neural network training method
CN109711534A (en) Dimensionality reduction model training method, device and electronic equipment
CN113313243A (en) Method, device and equipment for determining neural network accelerator and storage medium
CN108875482A (en) Object detecting method and device, neural network training method and device
CN114626503A (en) Model training method, target detection method, device, electronic device and medium
US20190205728A1 (en) Method for visualizing neural network models
US20180137408A1 (en) Method and system for event-based neural networks
CN108875924A (en) Data processing method, device, system and storage medium neural network based
CN112508190A (en) Method, device and equipment for processing structured sparse parameters and storage medium
CN108876790A (en) Image semantic segmentation method and device, and neural network training method and device
JP2021507345A (en) Fusion of sparse kernels to approximate the complete kernel of convolutional neural networks
CN107402905A (en) Computing method and device based on neural network
CN107578055A (en) Image prediction method and apparatus
CN108596328A (en) Fixed-point method and device, and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant after: MEGVII INC.

Applicant after: Beijing maigewei Technology Co., Ltd.

Address before: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant before: MEGVII INC.

Applicant before: Beijing aperture Science and Technology Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170503