CN109948795A - Method and apparatus for determining a network structure precision and delay optimization point

Info

Publication number
CN109948795A
Authority
CN
China
Prior art keywords
network structure
delay
precision
training
optimization point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910181390.XA
Other languages
Chinese (zh)
Other versions
CN109948795B (en
Inventor
李鑫
潘争
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Uisee Technologies Beijing Co Ltd
Original Assignee
Uisee Technologies Beijing Co Ltd
Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The purpose of the present application is to provide a method for determining a network structure precision and delay optimization point. The method first estimates the delays of a plurality of network structures, where the delay of each network structure is the sum of the delays of all modules in that network structure. It then determines a sub-search space based on the delays of the plurality of network structures. Finally, it selects at least one network structure from the sub-search space as the search result. Because the method uses the estimated delays to shrink the search space, the network structure search becomes more efficient.

Description

Method and apparatus for determining a network structure precision and delay optimization point
Technical field
The present invention relates to the field of neural networks, and in particular to network structure search.
Background
With the rise and development of neural networks, configuring a neural network structure (also known as network structure search) has become an urgent problem. In recent years, deploying neural networks on embedded devices has become a popular research topic, especially in the field of autonomous driving.
Unlike terminal devices, embedded devices have limited computing power, which places many restrictions on the configuration of neural network structures. The most prominent problem is that large neural network structures compute slowly. Existing methods usually increase speed at the cost of performance. A neural network structure that strikes the best balance between speed and precision is therefore needed.
Summary of the invention
In one aspect, the present application provides a method for determining a network structure precision and delay optimization point. The method includes: estimating the delays of a plurality of network structures, each of which includes one or more modules, the delay of each network structure being the sum of the delays of the one or more modules in that network structure; determining a sub-search space based on the delays of the plurality of network structures; and selecting at least one network structure from the sub-search space as a search result.
In some embodiments, estimating the delays of the plurality of network structures includes: obtaining, through performance analysis, the delay of each module in each of the plurality of network structures. The module delays are then added to obtain the delay of each network structure.
In some embodiments, selecting at least one network structure from the sub-search space as the search result includes: a first step of determining a pruning set, the pruning set being the set of network structures that need to be deleted from the sub-search space; a second step of selecting any network structure from the complement of the pruning set relative to the sub-search space; a third step of training the selected network structure and obtaining its precision after training; a fourth step of adding the trained network structure to a training set, the initial training set being empty; and a fifth step of selecting at least one network structure from the training set as the search result, the precision of the search result being greater than that of those network structures in the training set whose delay is less than that of the search result. The first to fifth steps are repeated until the difference between at least two consecutively determined search results is less than a preset value, or until the number of repetitions equals a preset threshold.
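As a non-limiting illustration, the five steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: `train_and_evaluate` and `predecessors` are hypothetical callbacks supplied by the caller, already-trained structures are excluded from the second step, and "the difference between consecutive search results is less than a preset value" is simplified to "the search result is unchanged for `patience` consecutive iterations".

```python
import random
from typing import Callable, Dict, Hashable, Iterable, List, Set, Tuple


def select_result(trained: Dict[Hashable, float],
                  delay: Dict[Hashable, float]) -> Set[Hashable]:
    """Fifth step: keep structures more precise than every faster trained one."""
    return {w for w, acc in trained.items()
            if all(acc > a for x, a in trained.items() if delay[x] < delay[w])}


def search(sub_space: List[Hashable],
           delay: Dict[Hashable, float],
           train_and_evaluate: Callable[[Hashable], float],   # hypothetical
           predecessors: Callable[[Hashable], Iterable[Hashable]],  # hypothetical
           max_iters: int = 100,
           patience: int = 2,
           seed: int = 0) -> Tuple[Set[Hashable], Set[Hashable]]:
    rng = random.Random(seed)
    trained: Dict[Hashable, float] = {}  # training set, initially empty
    pruned: Set[Hashable] = set()        # pruning set, initially empty
    result: Set[Hashable] = set()
    unchanged = 0
    for _ in range(max_iters):
        # First step: update the pruning set from the current training set.
        for w, acc_w in trained.items():
            better = [delay[x] for x, a in trained.items() if a > acc_w]
            if better:
                fastest_better = min(better)  # delay of the optimal structure
                pruned.update(p for p in predecessors(w)
                              if p in delay and delay[p] > fastest_better)
        # Second step: pick any structure outside the pruning set
        # (assumption: structures that were already trained are skipped).
        remaining = [s for s in sub_space if s not in pruned and s not in trained]
        if not remaining:
            break
        w = rng.choice(remaining)
        # Third and fourth steps: train it and add it to the training set.
        trained[w] = train_and_evaluate(w)
        # Fifth step: select the current search result.
        new_result = select_result(trained, delay)
        unchanged = unchanged + 1 if new_result == result else 0
        result = new_result
        if unchanged >= patience:
            break
    return result, pruned
```

In this sketch the pruning set only ever grows, so each structure is either trained once or discarded without training, which is where the saving in training cost would come from.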
In some embodiments, the first step includes, for each network structure in the training set: determining an optimal network structure relative to that network structure, the optimal network structure being the network structure with the smallest delay among the network structures whose precision is greater than a threshold; determining the preceding values of that network structure, a preceding value being a network structure whose width and depth are both less than those of that network structure; and adding to the pruning set each preceding value whose delay is greater than that of the optimal network structure, the initial pruning set being empty.
In some embodiments, for each of the other network structures, the delay and the precision of its preceding values are both less than those of that network structure.
In some embodiments, the method further includes constructing the plurality of network structures, the plurality of network structures being backbone network structures, wherein the first layer and the second layer of each backbone network structure each include one convolutional layer.
In some embodiments, the method further includes constructing the plurality of network structures, the plurality of network structures being decoding network structures, wherein each decoding network structure includes a width controller connected to a backbone network structure, the width controller being a 1×1 convolutional layer.
A system for determining a network structure precision and delay optimization point includes: at least one storage device including a set of instructions; and at least one processor in communication with the at least one storage device, wherein, when executing the set of instructions, the at least one processor is configured to cause the system to: estimate the delays of a plurality of network structures, each of which includes one or more modules, the delay of each network structure being the sum of the delays of the one or more modules in that network structure; determine a sub-search space based on the delays of the plurality of network structures; and select at least one network structure from the sub-search space as a search result.
In some embodiments, to select at least one network structure from the sub-search space as the search result, the at least one processor is configured to cause the system to perform: a first step of determining a pruning set, the pruning set being the set of network structures that need to be deleted from the sub-search space; a second step of selecting any network structure from the complement of the pruning set relative to the sub-search space; a third step of training the selected network structure and obtaining its precision after training; a fourth step of adding the trained network structure to a training set, the initial training set being empty; and a fifth step of selecting at least one network structure from the training set as the search result, the precision of the search result being greater than that of those network structures in the training set whose delay is less than that of the search result. The first to fifth steps are repeated until the difference between at least two consecutively determined search results is less than a preset value, or until the number of repetitions equals a preset threshold.
In some embodiments, the first step includes, for each network structure in the training set: determining an optimal network structure relative to that network structure, the optimal network structure being the network structure with the smallest delay among the network structures whose precision is greater than a threshold; determining the preceding values of that network structure, a preceding value being a network structure whose width and depth are both less than those of that network structure; and adding to the pruning set each preceding value whose delay is greater than that of the optimal network structure, the initial pruning set being empty.
In another aspect, the present application provides a device for determining a network structure precision and delay optimization point. The device can perform the method for determining a network structure precision and delay optimization point described above.
Other features of the present application are set forth in part in the description below. From that description, the content of the following drawings and embodiments will become apparent to those of ordinary skill in the art. The inventive aspects of the present application can be fully illustrated by practicing or using the methods, means and combinations thereof described in the detailed examples discussed below.
Brief description of the drawings
The following drawings describe in detail the exemplary embodiments disclosed in the present application. The same reference numerals denote similar structures throughout the several views of the drawings. Those of ordinary skill in the art will understand that these embodiments are non-limiting, exemplary embodiments, that the drawings are for purposes of illustration and description only and are not intended to limit the scope of the present disclosure, and that other embodiments may equally accomplish the inventive intent of the present application. It should be understood that the drawings are not drawn to scale. In the drawings:
Fig. 1 shows an exemplary device for determining a network structure precision and delay optimization point according to some embodiments of the present application;
Fig. 2 shows a flowchart of a method for determining a network structure precision and delay optimization point according to some embodiments of the present application;
Fig. 3 shows a flowchart of selecting at least one network structure from the sub-search space as the search result according to some embodiments of the present application;
Fig. 4 shows an exemplary semantic segmentation network model according to some embodiments of the present application;
Fig. 5 shows an exemplary backbone network structure according to some embodiments of the present application;
Fig. 6 shows an exemplary residual module according to some embodiments of the present application;
Fig. 7 shows an exemplary fusion node according to some embodiments of the present application;
Fig. 8 shows the linear relationship between the estimated delay and the delay obtained by performance analysis;
Fig. 9 shows the relationship between the precision difference and the delay difference between a network structure and its preceding values;
Fig. 10 is a schematic diagram of the changes in depth and width between different network structures;
Fig. 11 shows the pruning set, the final search result and the sub-search space;
Fig. 12 compares the search results of the method for determining a network structure precision and delay optimization point according to some embodiments of the present application with those of other methods.
Detailed description of the embodiments
The following description provides specific application scenarios and requirements of the present application in order to enable those skilled in the art to make and use the content of the present application. Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles defined here may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. The present disclosure is therefore not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
The terminology used here is only for the purpose of describing specific example embodiments and is not limiting. For example, unless the context clearly indicates otherwise, the singular forms "a", "an" and "the" used here may also include the plural forms. When used in this specification, the terms "include", "comprise" and/or "contain" mean that the associated integers, steps, operations, elements and/or components are present, but do not exclude the presence of one or more other features, integers, steps, operations, elements, components and/or groups, or the addition of other features, integers, steps, operations, elements, components and/or groups to the system or method.
These and other features of the present disclosure, as well as the operation and functions of the related elements of the structure, the combination of parts and the economies of manufacture, may become more apparent in view of the following description. Reference is made to the accompanying drawings, all of which form a part of this disclosure. It should be expressly understood, however, that the drawings are for purposes of illustration and description only and are not intended to limit the scope of the present disclosure.
The flowcharts used in this disclosure illustrate operations implemented by systems according to some embodiments of this disclosure. It should be expressly understood that the operations of a flowchart need not be performed in order; they may instead be performed in reverse order or simultaneously. In addition, one or more other operations may be added to a flowchart, and one or more operations may be removed from a flowchart.
The purpose of the present application is to provide a method for determining a network structure precision and delay optimization point. First, the method estimates the delays of network structures and selects the network structures whose delay lies within a preset range as a sub-search space, thereby reducing the network structure search space. Then, based on a partial order assumption, the method deletes the pruning set, further reducing the sub-search space. Finally, the method iteratively selects and trains network structures from the reduced sub-search space, and from them selects the network structure with the best balance of speed and precision. Because the method selects among trained network structures and takes into account the computing power of the actual operating platform, the finally selected network structure is more accurate and meets expectations.
Fig. 1 shows an exemplary device for determining a network structure precision and delay optimization point according to some embodiments of the present application.
The device 100 for determining a network structure precision and delay optimization point can perform the method for determining a network structure precision and delay optimization point disclosed in the present application. In some embodiments, the device 100 may be a computing device on which that method is implemented.
As an example, the device 100 may include a COM port 150 connected to a network to facilitate data communication. The device 100 may also include a processor 120, in the form of one or more processors, for executing computer instructions. The computer instructions may include, for example, routines, programs, objects, components, data structures, procedures, modules and functions that perform the specific functions described herein. For example, the processor 120 can estimate the delays of a plurality of network structures and then determine a sub-search space based on the estimated delays. As another example, the processor 120 can determine a pruning set and then, based on the pruning set, search for the network structure with the best balance of speed and precision.
In some embodiments, the processor 120 may include one or more hardware processors, such as a microcontroller, a microprocessor, a reduced instruction set computer (RISC), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a central processing unit (CPU), a graphics processing unit (GPU), a physics processing unit (PPU), a microcontroller unit, a digital signal processor (DSP), a field-programmable gate array (FPGA), an advanced RISC machine (ARM), a programmable logic device (PLD), any circuit or processor capable of performing one or more functions, or any combination thereof.
As an example, the device 100 may include an internal communication bus 110, and program storage and data storage of various forms (for example, a disk 170, a read-only memory (ROM) 130 or a random access memory (RAM) 140) for the various data files processed and/or transmitted by the computer. The device 100 may also include program instructions, stored in the ROM 130, the RAM 140 and/or other types of non-transitory storage media, to be executed by the processor 120. The methods and/or processes of the present application may be implemented as program instructions. The device 100 further includes an I/O component 160 that supports input/output between the computer and other components (for example, user interface elements). The device 100 may also receive programs and data via network communication.
For illustration only, a single processor is described for the device 100 in the present application. It should be noted, however, that the device 100 may also include multiple processors, so operations and/or method steps described in this disclosure as performed by one processor may also be performed jointly by multiple processors. For example, if the processor 120 of the device 100 performs step A and step B in the present application, it should be understood that step A and step B may also be performed jointly or separately by two different processors (for example, a first processor performs step A and a second processor performs step B, or the first and second processors jointly perform steps A and B).
Fig. 2 shows a flowchart of a method for determining a network structure precision and delay optimization point according to some embodiments of the present application. The process 200 may be implemented as a set of instructions in a non-transitory storage medium of the device 100 for determining a network structure precision and delay optimization point. The device 100 can execute the set of instructions and accordingly perform the steps of the process 200.
The operations of the process 200 presented below are intended to be illustrative and not limiting. In some embodiments, the process 200 may be implemented with one or more additional operations that are not described, and/or without one or more of the operations described here. In addition, the order of the operations shown in Fig. 2 and described below is not limiting.
In 210, the device 100 can estimate the delays of a plurality of network structures. Each of the plurality of network structures includes one or more modules, and the delay of each network structure is the sum of the delays of the one or more modules in that network structure.
The device 100 can obtain the delay of each module in each network structure through performance analysis, and then add the delays of all modules in a network structure to obtain the delay of that network structure. The performance analysis tool may be the TensorRT library.
In general, modules with the same configuration have essentially the same delay. The device 100 can therefore build a delay look-up table that contains the delays of modules with different configurations. The configuration of a module is given by the sizes of its input and output image features and can be expressed as (c_i, h_i, w_i, c_o, h_o, w_o), where c_i and c_o denote the numbers of channels of the input and output image features, and h_i, w_i and h_o, w_o denote the corresponding spatial sizes (height and width). For example, the device 100 can find from the delay look-up table that a module with configuration (32, 112, 112, 64, 56, 56) has a delay of 0.143 milliseconds on the computing chip.
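As a non-limiting illustration, the look-up-table estimate can be sketched in Python as follows. The names `DELAY_TABLE_MS` and `estimate_delay_ms` are hypothetical, and only the 0.143 ms entry comes from the text; the second entry is a made-up placeholder rather than a measured value.

```python
from typing import Dict, List, Tuple

# Module configuration (c_i, h_i, w_i, c_o, h_o, w_o):
# input channels/height/width and output channels/height/width.
ModuleConfig = Tuple[int, int, int, int, int, int]

# Hypothetical delay look-up table in milliseconds.
DELAY_TABLE_MS: Dict[ModuleConfig, float] = {
    (32, 112, 112, 64, 56, 56): 0.143,  # value quoted in the description
    (64, 56, 56, 64, 56, 56): 0.120,    # illustrative placeholder, not measured
}


def estimate_delay_ms(modules: List[ModuleConfig],
                      table: Dict[ModuleConfig, float] = DELAY_TABLE_MS) -> float:
    """Estimate a network structure's delay as the sum of its modules' delays."""
    missing = [m for m in modules if m not in table]
    if missing:
        # A configuration that was never profiled cannot be estimated.
        raise KeyError(f"no profiled delay for module configurations: {missing}")
    return sum(table[m] for m in modules)
```

Because each distinct module configuration is profiled once and then reused, the delay of any candidate structure can be estimated without running that structure on the target platform.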
Fig. 8 shows the linear relationship between the estimated delay and the delay obtained by performance analysis. As Fig. 8 shows, the two follow the linear relationship y = x; that is, the estimated delay and the delay obtained by performance analysis are essentially equal. Estimating the delay from per-module performance analysis is therefore fairly accurate.
In 220, the device 100 can determine a sub-search space based on the delays of the plurality of network structures.
The plurality of network structures constitutes the search space. Based on the delays of the network structures, the device 100 can select some of the network structures from the search space, and these network structures constitute the sub-search space. The device 100 can select the network structures whose delay lies within a preset range [T_min, T_max] to form the sub-search space. The preset range [T_min, T_max] can be set by taking into account the computing power of the operating platform and the service requirements.
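As a non-limiting illustration, the selection of the sub-search space can be sketched in Python as follows. The function name `select_sub_search_space` is hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
from typing import Dict, Hashable, List


def select_sub_search_space(delays_ms: Dict[Hashable, float],
                            t_min: float,
                            t_max: float) -> List[Hashable]:
    """Keep the structures whose estimated delay lies in [t_min, t_max]."""
    if t_min > t_max:
        raise ValueError("t_min must not exceed t_max")
    # Assumption: both bounds are inclusive.
    return [s for s, d in delays_ms.items() if t_min <= d <= t_max]
```

For example, with estimated delays `{"a": 0.5, "b": 1.2, "c": 3.0}` and the range [1.0, 2.0], only structure `"b"` remains in the sub-search space.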
In 230, the device 100 can select at least one network structure from the sub-search space as the search result.
The at least one network structure can be a network structure with the best balance of speed and precision. Best balance of speed and precision means that the precision of the network structure is greater than that of every faster network structure in the sub-search space, and that the speed of the network structure is greater than that of every more precise network structure in the sub-search space.
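As a non-limiting illustration, the best-balance condition above can be checked in Python as follows. The `Candidate` type and the function `balanced_structures` are hypothetical names, and the delay and precision values in the usage note are made up for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    delay_ms: float   # lower delay means higher speed
    precision: float


def balanced_structures(cands: List[Candidate]) -> List[Candidate]:
    """Return candidates that beat every faster one in precision
    and every more precise one in speed."""
    result = []
    for c in cands:
        beats_faster = all(c.precision > o.precision
                           for o in cands if o.delay_ms < c.delay_ms)
        beats_more_precise = all(c.delay_ms < o.delay_ms
                                 for o in cands if o.precision > c.precision)
        if beats_faster and beats_more_precise:
            result.append(c)
    return result
```

With candidates A (1.0 ms, 0.60), B (2.0 ms, 0.70), C (2.5 ms, 0.65) and D (3.0 ms, 0.75), structure C is dropped because B is both faster and more precise, while A, B and D each satisfy the condition.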
The device 100 can determine a pruning set, select network structures for training based on the pruning set, and finally select at least one of the trained network structures as the search result. The pruning set is the set of network structures that need to be deleted from the sub-search space. More description of step 230 can be found in Fig. 3 and its description.
The plurality of network structures may be obtained by the device 100 from a network structure database, or may be constructed by the device 100. In the latter case, the process 200 may further include the device 100 constructing the plurality of network structures.
As shown in Fig. 4, for a semantic segmentation network model, the device 100 can construct backbone network structures and decoding network structures.
As an example, a backbone network structure constructed by the device 100 may include 6 layers (stages) and generate a classification result from an input image. The first to fifth layers may each downsample the input (for example, with a stride of 2), and the sixth layer includes global average pooling and a fully connected layer. The first and second layers extract low-level features from the input image and generally carry a large computational burden. Therefore, the first layer and the second layer of the backbone network structure each include a single convolutional layer, which improves the efficiency of the network structure. The third to fifth layers may contain L, M and N residual modules, respectively, where L, M and N are positive integers. As shown in Fig. 5, a residual module may include two convolutional layers and one shortcut connection. The numbers of residual modules in these layers are independent of one another; that is, the values of L, M and N are independent. By varying the values of L, M and N, the device 100 can construct multiple backbone network structures.
It should be noted that the number of residual modules in each layer indicates the depth of the network structure, and the number of channels in each residual module indicates the width of that residual module. For ease of description, the width of the i-th residual module of layer s is denoted by a symbol indexed by s and i, where s is 3, 4 or 5. In general, this width can be 64, 128, 256, 512 or 1024.
When the device 100 constructs backbone network structures, the first, second and sixth layers of the backbone network structure are fixed; only the numbers of residual modules in the third, fourth and fifth layers (that is, the depth) and the number of channels in each residual module (that is, the width) differ. A backbone network structure can therefore be represented, or encoded, by the depth of the network structure and the widths of its residual modules. For example, a backbone network structure can be written as the list of residual-module widths of its third, fourth and fifth layers, such as [(128), (256, 256), (512)]. Fig. 6 shows an exemplary residual module according to some embodiments of the present application.
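As a non-limiting illustration, this encoding can be sketched in Python as follows. The type names and the functions `depths` and `enumerate_encodings` are hypothetical, and the enumerator assumes, for brevity only, that all residual modules within one layer share the same width.

```python
from itertools import product
from typing import Iterator, Sequence, Tuple

Stage = Tuple[int, ...]                # widths of the residual modules in one layer
Encoding = Tuple[Stage, Stage, Stage]  # layers 3, 4 and 5 of the backbone


def depths(encoding: Encoding) -> Tuple[int, ...]:
    """Number of residual modules in layers 3, 4 and 5, i.e. (L, M, N)."""
    return tuple(len(stage) for stage in encoding)


def enumerate_encodings(widths: Sequence[int],
                        max_depth: int) -> Iterator[Encoding]:
    """Enumerate candidate backbones by varying L, M, N and the layer widths.

    Assumption: every residual module in a layer uses the same width.
    """
    stage_options = [(w,) * d for d in range(1, max_depth + 1) for w in widths]
    return product(stage_options, repeat=3)
```

For example, `depths(((128,), (256, 256), (512,)))` gives `(1, 2, 1)`, and enumerating with widths `[64, 128]` and a maximum depth of 2 yields 4 options per layer, hence 64 candidate backbones.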
As an example, a decoding network structure constructed by the device 100 may include a width controller connected to the backbone network structure, the width controller being a 1×1 convolutional layer. As shown in Fig. 4, the decoding network structure may include three width controllers, connected to the third, fourth and fifth layers of the backbone network structure, respectively. The decoding network structure also includes fusion nodes; Fig. 7 shows an exemplary fusion node.
It should be noted that the number of channels in each width controller indicates the width of that width controller. For ease of description, C_s denotes the width of the width controller of layer s, where s is 3, 4 or 5. By varying the values of C_3, C_4 and C_5, the device 100 can construct multiple decoding network structures.
The method and apparatus for determining a network structure precision and delay optimization point disclosed in the present application are applicable to various network structures; the description below uses backbone network structures as an example.
Fig. 3 shows a flowchart of selecting at least one network structure from the sub-search space as the search result according to some embodiments of the present application. The process 300 may be implemented as a set of instructions in a non-transitory storage medium of the device 100 for determining a network structure precision and delay optimization point. The device 100 can execute the set of instructions and accordingly perform the steps of the process 300.
The operations of the process 300 presented below are intended to be illustrative and not limiting. In some embodiments, the process 300 may be implemented with one or more additional operations that are not described, and/or without one or more of the operations described here. In addition, the order of the operations shown in Fig. 3 and described below is not limiting.
In 310, the device 100 can determine a pruning set.
As described above, the pruning set is the set of network structures that need to be deleted from the sub-search space; that is, the network structures in the pruning set are deleted from the sub-search space in order to further reduce it.
The device 100 can perform the following steps to determine and update the pruning set.
For each network structure (that is, w) in the training set:
First, determine the optimal network structure relative to that network structure (that is, y_w). The training set is the set of network structures that have been trained; the initial training set is the empty set. The optimal network structure is the network structure with the smallest delay among those whose precision is greater than a threshold. For example, the optimal network structure is the network structure whose precision is greater than that of w and whose delay is the smallest, as shown in formula (1).
where y_w is the optimal network structure relative to w, D is the training set, and w is each element (that is, each network structure) of the training set D.
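Formula (1) is not reproduced in this text. A form consistent with the definitions above, assuming the threshold is the precision of w itself, would be the following; it is a reconstruction from the surrounding description, not the original typesetting.

```latex
y_w \;=\; \operatorname*{arg\,min}_{x \in D,\ \mathrm{Acc}(x) > \mathrm{Acc}(w)} \mathrm{Lat}(x) \qquad (1)
```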
Second, determine the predecessors of each network structure. A predecessor is a network structure whose width and depth are both smaller than those of that network structure.
Fig. 10 is a schematic diagram of the changes in depth and width between different network structures. Solid arrows indicate a reduction in the depth of a network structure, and dashed arrows indicate a reduction in its width. For example, a depth predecessor of network structure [(128), (256, 256), (512)] is [(128), (256), (512)], because the number of residual modules in the fourth layer changes from 2 to 1. As another example, a width predecessor of network structure [(128), (256, 256), (512)] is [(128), (256, 256), (256)], because the number of channels of the first residual module in the fifth layer changes from 512 to 256.
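The predecessor relation in these examples can be checked with a short sketch. The encoding (a list of layers, each a tuple of per-module channel counts) mirrors the bracket notation above; the function name `is_predecessor` and the exact rule (no layer deeper or wider, and not identical) are assumptions chosen so that both examples hold.

```python
def is_predecessor(x, y):
    """Return True if x is a predecessor of y under the assumed encoding.

    A network structure is a list of layers; each layer is a tuple holding the
    channel count of each residual module. x counts as a predecessor of y when
    no layer of x is deeper or wider than the same layer of y, and x != y.
    """
    if len(x) != len(y) or list(x) == list(y):
        return False
    for layer_x, layer_y in zip(x, y):
        if len(layer_x) > len(layer_y):  # x would be deeper in this layer
            return False
        if any(cx > cy for cx, cy in zip(layer_x, layer_y)):  # x would be wider
            return False
    return True

net = [(128,), (256, 256), (512,)]
depth_pred = [(128,), (256,), (512,)]      # one fewer residual module in layer 4
width_pred = [(128,), (256, 256), (256,)]  # layer-5 module narrowed from 512 to 256

print(is_predecessor(depth_pred, net))  # True
print(is_predecessor(width_pred, net))  # True
print(is_predecessor(net, depth_pred))  # False
```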
It should be noted that, in general, the delay and precision of a network structure are greater than those of its predecessors; this is referred to as the partial order assumption. For example, if network structure x is a predecessor of network structure y, the delay and precision of x are no greater than those of y, as shown in formula (2).
Lat(x) ≤ Lat(y), Acc(x) ≤ Acc(y),    (2)
where Lat(x) and Lat(y) denote the delays of x and y, respectively, and Acc(x) and Acc(y) denote the precisions of x and y, respectively.
Fig. 9 plots the precision difference against the delay difference between network structures and their predecessors. As shown in Fig. 9, the overwhelming majority of the points determined by these precision and delay differences lie in the first quadrant. It follows that the assumption that the delay and precision of a network structure are greater than those of its predecessors is supported by experimental data and holds with high accuracy within a certain range.
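One way to quantify how well measured data support the partial order assumption is to count the share of (structure, predecessor) pairs that fall in the first quadrant. The sketch below uses invented numbers purely for illustration; the function name `first_quadrant_fraction` is hypothetical.

```python
def first_quadrant_fraction(pairs):
    """pairs: iterable of ((lat, acc) of a structure, (lat, acc) of its predecessor)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        1 for (lat, acc), (p_lat, p_acc) in pairs
        if lat - p_lat >= 0 and acc - p_acc >= 0
    )
    return hits / len(pairs)

# Made-up measurements, for illustration only.
measurements = [
    ((10.0, 0.70), (8.0, 0.68)),
    ((12.0, 0.72), (9.5, 0.69)),
    ((11.0, 0.71), (10.0, 0.715)),  # violates the assumption: predecessor is more precise
    ((15.0, 0.75), (12.0, 0.72)),
]
print(first_quadrant_fraction(measurements))  # 0.75
```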
Third, add to the pruning set those predecessors of each network structure whose delay is greater than that of the optimal network structure. The initial pruning set is the empty set.
As noted above, the delay and precision of a network structure are greater than those of its predecessors. The precision of every predecessor of a network structure w is therefore less than that of w, and hence also less than that of the optimal network structure y_w relative to w, as shown in formula (3).
where m is an element of the pruning set, y_w is the optimal network structure relative to w, w is each element of the training set D, and ΔP_w is the subset of the pruning set determined by w.
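Formula (3) is not reproduced in this text. Based on the preceding paragraph and the partial order assumption of formula (2), it plausibly expresses the following chain of inequalities. This is a reconstruction under that reading, with the strict inequality assuming that y_w is chosen to be more precise than w.

```latex
\mathrm{Acc}(m) \;\le\; \mathrm{Acc}(w) \;<\; \mathrm{Acc}(y_w), \qquad \forall\, m \in \Delta P_w,\ w \in D \qquad (3)
```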
Thus, a predecessor of w whose delay is greater than that of y_w has both a greater delay and a lower precision than y_w. Such predecessors of w can therefore be added to the pruning set.
It follows that a given network structure determines one subset of the pruning set, as shown in formula (4).
where m is an element of the pruning set, w is each element of the training set, m < w indicates that m is a predecessor of w, y_w is the optimal network structure relative to w, ΔP_w is the subset of the pruning set determined by w, and the remaining symbol in the formula denotes the sub-search space.
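Formula (4) is not reproduced in this text. From the symbol definitions above, a consistent form would be the following, where the symbol \mathcal{S} for the sub-search space is an assumption, since the original symbol is not available here.

```latex
\Delta P_w \;=\; \{\, m \in \mathcal{S} \;\mid\; m < w,\ \mathrm{Lat}(m) > \mathrm{Lat}(y_w) \,\} \qquad (4)
```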
The pruning set is then the union of the subsets determined by all network structures w, as shown in formula (5).
P = ∪_w ΔP_w,    (5)
where P is the pruning set, w is each element of the training set, and ΔP_w is the subset of the pruning set determined by w.
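The three steps for building the pruning set can be sketched as follows. This is a minimal illustration, not the application's implementation: structures are encoded in a simplified way as tuples of per-layer sizes, the predecessor test is component-wise, and all function names and numbers are assumptions.

```python
def is_predecessor(m, w):
    """Simplified encoding: a structure is a tuple of per-layer sizes."""
    return m != w and len(m) == len(w) and all(a <= b for a, b in zip(m, w))

def optimal_for(w, acc, lat):
    """y_w: the lowest-delay trained structure more precise than w, or None."""
    better = [x for x in acc if acc[x] > acc[w]]
    return min(better, key=lambda x: lat[x]) if better else None

def pruning_set(sub_space, acc, lat):
    """P: union over trained w of {m in sub_space : m < w and Lat(m) > Lat(y_w)}."""
    pruned = set()
    for w in acc:                      # acc holds the training set D
        y_w = optimal_for(w, acc, lat)
        if y_w is None:                # no trained structure is more precise than w
            continue
        pruned |= {m for m in sub_space
                   if is_predecessor(m, w) and lat[m] > lat[y_w]}
    return pruned

# Toy data (invented): estimated delays for the sub-search space, precisions for D.
lat = {(2, 1, 2): 9.0, (2, 2, 2): 12.0, (1, 2, 2): 10.0, (1, 1, 1): 5.0, (1, 3, 1): 8.0}
acc = {(2, 2, 2): 0.70, (1, 3, 1): 0.72}

print(sorted(pruning_set(lat.keys(), acc, lat)))  # [(1, 2, 2), (2, 1, 2)]
```

In this toy run, (1, 3, 1) is more precise than (2, 2, 2) at a delay of 8.0, so the two predecessors of (2, 2, 2) with a delay above 8.0 are pruned without being trained, while (1, 1, 1) stays in the search space.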
In 320, the device 100 may select any network structure from the complement of the pruning set relative to the sub-search space.
The complement of the pruning set relative to the sub-search space is the reduced sub-search space. The device 100 may randomly select a network structure from this reduced search space.
In 330, the device 100 may train the selected network structure and obtain its precision after training.
In 340, the device 100 may add the trained network structure to the training set. The initial training set is the empty set; each time a network structure is trained, the training set gains one element.
In 350, the device 100 may select at least one network structure from the training set as the search result. The precision of each search result is greater than that of those network structures in the training set whose delay is less than that of the search result, as shown in formula (6).
where B(D) is the search result, D is the training set, w is an element of the training set D, x is an element of the search result B(D), Lat(w) and Lat(x) denote the delays of w and x, respectively, and Acc(w) and Acc(x) denote the precisions of w and x, respectively.
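The selection in 350 amounts to keeping the trained network structures that are more precise than every trained structure with a lower delay. A minimal sketch under that reading follows; the function name `search_result` and the numbers are invented for illustration.

```python
def search_result(acc, lat):
    """B(D): trained structures more precise than every lower-delay trained structure."""
    return {
        x for x in acc
        if all(acc[w] < acc[x] for w in acc if lat[w] < lat[x])
    }

# Toy training set D (invented values).
acc = {"A": 0.70, "B": 0.72, "C": 0.71, "D": 0.75}
lat = {"A": 5.0, "B": 8.0, "C": 9.0, "D": 12.0}

print(sorted(search_result(acc, lat)))  # ['A', 'B', 'D']
```

Structure C is left out because B is both faster and more precise.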
The device 100 repeats the first step through the fifth step (310 through 350) until the difference between at least two consecutively determined search results is less than a preset value (for example, at least two consecutively determined search results are identical) or until the number of repetitions equals a preset threshold. With each repetition, the number of trained network structures in the training set increases by one, and the pruning set is updated accordingly. When either condition is met, the device 100 stops the search and obtains the final search result.
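The repetition of 310 through 350 with its two stopping conditions can be sketched as a loop with pluggable steps. Everything named here is hypothetical: `run_search`, the `patience` parameter (the number of consecutive unchanged results that ends the search), and the choice to skip structures that were already trained are assumptions, and the demo uses a stand-in that prunes nothing.

```python
import random

def run_search(space, train, prune, select, max_rounds=100, patience=3, seed=0):
    """Repeat 310-350 until the result stays unchanged or the round limit is hit."""
    rng = random.Random(seed)
    trained = {}                     # training set D: structure -> precision
    result, unchanged = set(), 0
    for _ in range(max_rounds):
        pruned = prune(trained)                                  # 310
        # Assumption: structures that were already trained are not selected again.
        candidates = sorted(set(space) - pruned - set(trained))
        if not candidates:
            break
        w = rng.choice(candidates)                               # 320
        trained[w] = train(w)                                    # 330 and 340
        new_result = select(trained)                             # 350
        unchanged = unchanged + 1 if new_result == result else 0
        result = new_result
        if unchanged >= patience:
            break
    return result, trained

# Toy data (invented): delay and precision-after-training for five structures.
LAT = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}
ACC = {"a": 0.60, "b": 0.66, "c": 0.64, "d": 0.70, "e": 0.69}

def select(trained):
    """Keep structures more precise than every lower-delay trained structure."""
    return {x for x in trained
            if all(trained[w] < trained[x] for w in trained if LAT[w] < LAT[x])}

result, trained = run_search(LAT, ACC.get, prune=lambda trained: set(), select=select)
print(sorted(result), len(trained))
```

In a real search, `prune` would return the pruning set computed from the current training set, and `train` would actually train the selected network structure.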
Fig. 11 shows the subsets of the pruning set, the final search result, and the sub-search space. In the training set, each network structure other than the optimal network structure (for example, w1, w2, w3) yields a subset of the pruning set (for example, P_w1, P_w2, P_w3). The union of all these subsets is the pruning set P. The device 100 may select any network structure from the complement of the pruning set relative to the sub-search space and train it, forming the training set D; the final search result B(D) is then determined within the training set D. Clearly, the search result B(D) is a subset of the training set D.
Fig. 12 compares the search results of the method for determining the network structure precision and delay optimization point, according to some embodiments of the present application, with those of other methods. In the experiment, the network structures in the search space are backbone network structures of a semantic segmentation network model. DF1, DF2, and DF2A are network structures obtained by the method for determining the network structure precision and delay optimization point disclosed in this application.
More detailed experimental results are shown in Table one, where FLOPs denotes the number of floating-point operations.
As shown in Table one, compared with MobileNet V1 and MobileNet V2, the precision of DF2 is 2.2% and 1.1% higher, respectively, and its delay is 18% and 43% lower, respectively; compared with ShuffleNet V1 and ShuffleNet V2, the precision of DF1 is comparable, but its delay is 47% and 39% lower, respectively. It follows that, when the characteristics of the specific running platform (for example, its actual delay) are taken into account, the search results obtained by the method for determining the network structure precision and delay optimization point disclosed in this application achieve a better balance between speed and precision.
The FLOPs of DF1 are higher than those of MobileNet V1, MobileNet V2, ShuffleNet V1, and ShuffleNet V2, yet its delay is significantly lower. This shows that indirect metrics (for example, FLOPs) and direct metrics (for example, delay) are not consistent. It is therefore very important to take the characteristics of the hardware and software into account in order to reach the best balance between speed and precision.
Table one
In summary, after reading this detailed disclosure, those skilled in the art will understand that the foregoing detailed disclosure is presented by way of example only and is not limiting. Although not explicitly stated herein, those skilled in the art will understand that the present application is intended to cover various reasonable alterations, improvements, and modifications of the embodiments. These alterations, improvements, and modifications are intended to be suggested by this disclosure and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used in this application to describe embodiments of the present disclosure. For example, "one embodiment", "an embodiment", and/or "some embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment", "one embodiment", or "an alternative embodiment" in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
It should be appreciated that, in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of a feature. Alternatively, the application may disperse various features across multiple embodiments. This does not mean, however, that the combination of these features is required; a person skilled in the art reading this application may well extract some of the features and understand them as a separate embodiment. That is, the embodiments in this application may also be understood as an integration of multiple sub-embodiments, and each sub-embodiment remains valid when it contains fewer than all the features of a single foregoing disclosed embodiment.
In some embodiments, the numbers expressing quantities or properties used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term "about", "approximately", or "substantially". For example, unless otherwise stated, "about", "approximately", or "substantially" may indicate a ±20% variation of the value it describes. Accordingly, in some embodiments, the numerical parameters set forth in the written description and the appended claims are approximations that may vary depending on the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Although the numerical ranges and parameters setting forth the broad scope of some embodiments of the application are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.
Each of the patents, patent applications, publications of patent applications, and other materials referenced herein, such as articles, books, specifications, publications, documents, and the like, is hereby incorporated herein by reference in its entirety for all purposes, except for any prosecution file history associated with the same, any of the same that is inconsistent with or in conflict with the present document, and any of the same that may have a limiting effect on the broadest scope of the claims now or later associated with the present document. By way of example, should there be any inconsistency or conflict between the description, definition, and/or use of a term associated with any of the incorporated material and that associated with the present document, the description, definition, and/or use of the term in the present document shall prevail.
Finally, it is to be understood that the embodiments of the application disclosed herein are illustrative of the principles of the embodiments of the application. Other modified embodiments are also within the scope of the application. The embodiments disclosed herein are therefore presented by way of example only and not by way of limitation. Those skilled in the art may adopt alternative configurations in accordance with the embodiments herein to implement the invention of this application. Accordingly, embodiments of the present application are not limited to those precisely as shown and described.

Claims (10)

1. A method, implemented on a computing device, for determining a network structure precision and delay optimization point, characterized in that the method comprises:
estimating the delays of a plurality of network structures, each network structure of the plurality of network structures comprising one or more modules, the delay of each network structure being the sum of the delays of the one or more modules in that network structure;
determining a sub-search space based on the delays of the plurality of network structures; and
selecting at least one network structure from the sub-search space as a search result.
2. The method for determining a network structure precision and delay optimization point according to claim 1, characterized in that estimating the delays of the plurality of network structures comprises:
obtaining, through performance analysis, the delay of each module in each network structure of the plurality of network structures.
3. The method for determining a network structure precision and delay optimization point according to claim 1, characterized in that selecting at least one network structure from the sub-search space as a search result comprises:
a first step of determining a pruning set, the pruning set being the set of network structures in the sub-search space that need to be deleted;
a second step of selecting any network structure from the complement of the pruning set relative to the sub-search space;
a third step of training the selected network structure and obtaining the precision of the trained network structure;
a fourth step of adding the trained network structure to a training set, the initial training set being the empty set;
a fifth step of selecting at least one network structure from the training set as the search result, the precision of the search result being greater than that of those network structures in the training set whose delay is less than that of the search result; and
repeating the first step through the fifth step until the difference between at least two consecutively determined search results is less than a preset value or until the number of repetitions equals a preset threshold.
4. The method for determining a network structure precision and delay optimization point according to claim 3, characterized in that the first step comprises:
for each network structure in the training set,
determining an optimal network structure relative to that network structure, the optimal network structure being the network structure with the smallest delay among the network structures whose precision is greater than a threshold;
determining the predecessors of that network structure, a predecessor being a network structure whose width and depth are both smaller than those of that network structure; and
adding to the pruning set those predecessors of that network structure whose delay is greater than that of the optimal network structure, the initial pruning set being the empty set.
5. The method for determining a network structure precision and delay optimization point according to claim 4, characterized in that the delay and the precision of the predecessors of each network structure are both less than those of that network structure.
6. The method for determining a network structure precision and delay optimization point according to claim 1, characterized in that the method further comprises:
constructing the plurality of network structures, the plurality of network structures being backbone network structures, the first layer and the second layer of each backbone network structure each comprising one convolutional layer.
7. The method for determining a network structure precision and delay optimization point according to claim 1, characterized in that the method further comprises:
constructing the plurality of network structures, the plurality of network structures being decoding network structures, each decoding network structure comprising a width controller connected to a backbone network structure, the width controller being a 1*1 convolutional layer.
8. A system for determining a network structure precision and delay optimization point, comprising:
at least one storage device, the storage device comprising a set of instructions; and
at least one processor in communication with the at least one storage device, wherein, when executing the set of instructions, the at least one processor is configured to cause the system to:
estimate the delays of a plurality of network structures, each network structure of the plurality of network structures comprising one or more modules, the delay of each network structure being the sum of the delays of the one or more modules in that network structure;
determine a sub-search space based on the delays of the plurality of network structures; and
select at least one network structure from the sub-search space as a search result.
9. The system for determining a network structure precision and delay optimization point according to claim 8, characterized in that, to select at least one network structure from the sub-search space as a search result, the at least one processor is configured to cause the system to perform:
a first step of determining a pruning set, the pruning set being the set of network structures in the sub-search space that need to be deleted;
a second step of selecting any network structure from the complement of the pruning set relative to the sub-search space;
a third step of training the selected network structure and obtaining the precision of the trained network structure;
a fourth step of adding the trained network structure to a training set, the initial training set being the empty set;
a fifth step of selecting at least one network structure from the training set as the search result, the precision of the search result being greater than that of those network structures in the training set whose delay is less than that of the search result; and
repeating the first step through the fifth step until the difference between at least two consecutively determined search results is less than a preset value or until the number of repetitions equals a preset threshold.
10. The system for determining a network structure precision and delay optimization point according to claim 9, characterized in that the first step comprises:
for each network structure in the training set,
determining an optimal network structure relative to that network structure, the optimal network structure being the network structure with the smallest delay among the network structures whose precision is greater than a threshold;
determining the predecessors of that network structure, a predecessor being a network structure whose width and depth are both smaller than those of that network structure; and
adding to the pruning set those predecessors of that network structure whose delay is greater than that of the optimal network structure, the initial pruning set being the empty set.
CN201910181390.XA 2019-03-11 2019-03-11 Method and device for determining network structure precision and delay optimization point Active CN109948795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910181390.XA CN109948795B (en) 2019-03-11 2019-03-11 Method and device for determining network structure precision and delay optimization point

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910181390.XA CN109948795B (en) 2019-03-11 2019-03-11 Method and device for determining network structure precision and delay optimization point

Publications (2)

Publication Number Publication Date
CN109948795A true CN109948795A (en) 2019-06-28
CN109948795B CN109948795B (en) 2021-12-14

Family

ID=67008704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910181390.XA Active CN109948795B (en) 2019-03-11 2019-03-11 Method and device for determining network structure precision and delay optimization point

Country Status (1)

Country Link
CN (1) CN109948795B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101170578A (en) * 2007-11-30 2008-04-30 北京理工大学 Hierarchical peer-to-peer network structure and constructing method based on syntax similarity
CN108229657A (en) * 2017-12-25 2018-06-29 杭州健培科技有限公司 A kind of deep neural network training and optimization algorithm based on evolution algorithmic
CN108416423A (en) * 2017-02-10 2018-08-17 三星电子株式会社 Automatic threshold for neural network trimming and retraining
CN108875894A (en) * 2018-05-30 2018-11-23 吉林大学 Eliminate formula Stochastic search optimization method in subspace
CN109284820A (en) * 2018-10-26 2019-01-29 北京图森未来科技有限公司 A kind of search structure method and device of deep neural network
WO2019034129A1 (en) * 2017-08-18 2019-02-21 北京市商汤科技开发有限公司 Neural network structure generation method and device, electronic equipment and storage medium
CN109389221A (en) * 2018-10-31 2019-02-26 济南浪潮高新科技投资发展有限公司 A kind of neural network compression method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
J.D. DONG等: "Dpp-net: Device-aware progressive search for paretooptimal neural architectures", 《ECCV》 *
XIN LI: "Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search", 《ARXIV:1903.03777V1》 *
YANG T J等: "Netadapt: Platform-aware neural network adaptation for mobile applications", 《ECCV》 *
孙环龙等: "前馈神经网络结构新型剪枝算法研究", 《广西师范学院学报(自然科学版)》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659690A (en) * 2019-09-25 2020-01-07 深圳市商汤科技有限公司 Neural network construction method and device, electronic equipment and storage medium
CN110659690B (en) * 2019-09-25 2022-04-05 深圳市商汤科技有限公司 Neural network construction method and device, electronic equipment and storage medium
WO2021114530A1 (en) * 2019-12-12 2021-06-17 Huawei Technologies Co., Ltd. Hardware platform specific operator fusion in machine learning
CN111191785A (en) * 2019-12-20 2020-05-22 沈阳雅译网络技术有限公司 Structure searching method based on expanded search space
CN111353601A (en) * 2020-02-25 2020-06-30 北京百度网讯科技有限公司 Method and apparatus for predicting delay of model structure
CN116522999A (en) * 2023-06-26 2023-08-01 深圳思谋信息科技有限公司 Model searching and time delay predictor training method, device, equipment and storage medium
CN116522999B (en) * 2023-06-26 2023-12-15 深圳思谋信息科技有限公司 Model searching and time delay predictor training method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109948795B (en) 2021-12-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant