CN110333857A - Automatic custom instruction identification method based on constraint programming - Google Patents
- Publication number
- CN110333857A (application CN201910627531.6A)
- Authority
- CN
- China
- Prior art keywords
- custom instruction
- instruction
- custom
- node
- constraint
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/35—Creation or generation of source code model driven
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
The present invention provides an automatic custom instruction identification method based on constraint programming and relates to the field of electronic design automation (EDA) technology. The method comprises two parts: custom instruction enumeration and custom instruction selection. Enumeration is realized by establishing a constraint programming model for custom instruction enumeration and enumerating from the data flow graph all subgraphs that satisfy the constraints; each constraint is modeled separately and, for the enumeration problem, all custom instructions that satisfy the constraints are found with a constraint programming method. Selection achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection: maximization objective functions are established for the processor performance improvement and the energy reduction brought by custom instructions, and a weight-based method converts the multi-objective optimization problem into a single-objective optimization problem.
Description
Technical field
The present invention relates to the field of EDA technology, and in particular to an automatic custom instruction identification method based on constraint programming.
Background art
In recent years, to meet the growing demand of embedded applications for high performance and low power consumption, extended instruction sets have been widely used in embedded systems. For example, the application-specific instruction-set processor (ASIP) combines the advantages of general-purpose processors and ASICs and offers a good compromise in design cycle, flexibility, performance and power consumption. A custom instruction in an extended instruction set encapsulates a series of basic instructions so that the basic instructions can be chained and parallelized, which improves performance.
Designing an extended instruction set for a specific application is the core step of ASIP design. Extended instruction sets are commonly used in fields such as multimedia processing and signal processing. To let heterogeneous multiprocessors run different multimedia applications better, Dammak et al. applied extended instruction sets to a heterogeneous multi-core system-on-chip and obtained a good trade-off between performance and power consumption. Momcilovic et al. used an ASIP to execute a data-adaptive motion estimation algorithm, which greatly reduced computation cost and improved video processing speed. Sisto et al. proposed an application-specific processor design for sensor signal conditioning in automotive applications.
At present, the field of image processing is developing rapidly and the quality of image processing keeps improving. Although learning-based mechanisms such as neural networks and support vector machines perform well in image processing, the huge volume of current image data means that some effective optimization algorithms need a large amount of time to process image data or to train on samples. In addition, real-time image processing imposes strict time limits. Research in China and abroad has found that applying extended instruction sets to image processing can significantly improve performance. Mori et al. proposed an application-specific processor design for accelerating real-time IP/CV algorithms. Edwards et al. applied an extended instruction set to a real-time object detection system and increased processing speed by 1.5 to 6.8 times.
Early research implemented applications efficiently by designing dedicated chips, but dedicated chips have long design cycles, their hardware is difficult to develop and debug, and their cost is high. More and more researchers have therefore shifted their focus to extended instructions, that is, to automatically identifying an extended instruction set for a specific application.
The process of automatically identifying an extended instruction set is shown in Fig. 1. First, the source code of the image processing algorithm is given as input to the open-source compiler GeCoS, which converts it into a control data flow graph (CDFG), a graph that represents the data dependences among multiple basic blocks. Then, a subgraph enumeration algorithm enumerates from the data flow graph all subgraphs that satisfy the constraints (a subgraph is the graph representation of a custom instruction). Next, a subgraph selection algorithm selects some of the best subgraphs among the enumerated ones as the final custom instructions. Finally, the source code is transformed into new code that contains the selected custom instructions.
Constraint programming is a general search technique combined with logical reasoning. It originates from the constraint satisfaction problem (CSP) in computer science and artificial intelligence. A constraint satisfaction problem consists of a given set of variables, the domains of those variables, and a set of constraints (equations, inequalities, procedures and so on can all serve as constraints). Solving it means finding, among all combinations of values, one or more combinations that satisfy the constraints. In general, combinatorial optimization and optimal scheduling problems are constraint satisfaction problems. When a problem is solved with constraint programming, its statement stays close to the practical problem and the constraints need not be converted into linear equalities or inequalities, which keeps the formulation simple and easy to understand.
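The definition of a constraint satisfaction problem above (variables, domains, constraints) can be illustrated with a minimal backtracking search. This is a sketch only: the function `solve`, the helper `both_set` and the toy problem are hypothetical names chosen for illustration, and a real constraint programming solver adds propagation and search heuristics on top of this idea.

```python
from typing import Callable, Dict, Iterator, List, Optional

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(variables: List[str],
          domains: Dict[str, List[int]],
          constraints: List[Constraint],
          assignment: Optional[Assignment] = None) -> Iterator[Assignment]:
    """Yield every complete assignment that satisfies all constraints."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        yield dict(assignment)
        return
    var = variables[len(assignment)]
    for value in domains[var]:
        assignment[var] = value
        # A constraint returns True while it is satisfied or still undecided.
        if all(check(assignment) for check in constraints):
            yield from solve(variables, domains, constraints, assignment)
        del assignment[var]


def both_set(a: Assignment, x: str, y: str) -> bool:
    """True once both variables have a value in the partial assignment."""
    return x in a and y in a


# Toy problem (hypothetical): x + y == 4 and x < y, with x, y in 0..4.
toy_constraints = [
    lambda a: not both_set(a, "x", "y") or a["x"] + a["y"] == 4,
    lambda a: not both_set(a, "x", "y") or a["x"] < a["y"],
]
solutions = list(solve(["x", "y"],
                       {"x": list(range(5)), "y": list(range(5))},
                       toy_constraints))
```

The constraints are stated directly as predicates, which mirrors the point made above that constraint programming does not require rewriting them as linear equalities or inequalities.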
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above shortcomings of the prior art, to provide an automatic custom instruction identification method based on constraint programming, which identifies custom instructions automatically with a constraint programming method.
To solve the above technical problem, the technical solution adopted by the present invention is an automatic custom instruction identification method based on constraint programming, comprising two parts: custom instruction enumeration and custom instruction selection.
Custom instruction enumeration is realized by establishing a constraint programming model for custom instruction enumeration and enumerating from the data flow graph all subgraphs that satisfy the constraints, specifically as follows.
To enumerate from the data flow graph $G = (V, E)$ all custom instructions that satisfy the given constraints, let the subgraph $S = (V_s, E_s)$, with $V_s \subseteq V$ and $E_s \subseteq E$, be the graph representation of a custom instruction instance, and let $I_1$ and $I_2$ denote the set of valid nodes and the set of illegal nodes of $G$, respectively.
The data flow graph $G = (V, E)$ is a directed acyclic graph (DAG). The node set $V = \{v_1, v_2, \dots, v_M\}$ represents basic instructions, where $M$ is the number of nodes of the data flow graph; the edge set $E$ represents the data dependences between instructions, and $m$ is the number of edges of the data flow graph.
The given constraints comprise: the constraint that a custom instruction contains no illegal node, the connectivity constraint of a custom instruction, the convexity constraint of a custom instruction, and the input/output constraint of a custom instruction.
Each constraint is modeled separately and, for the enumeration problem, all custom instructions that satisfy the constraints are found with a constraint programming method, which completes custom instruction enumeration.
The constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
$$\forall v \in I_2:\; v_{sel} = 0$$
where $v_{sel} = 0$ indicates that illegal node $v$ is not included in the custom instruction.
An illegal node is defined as follows: owing to limitations of the extensible processor architecture, two kinds of basic instructions, memory operations and branch operations, cannot be included in a custom instruction, and the nodes that represent these basic instructions are regarded as illegal nodes.
The connectivity constraint of a custom instruction is modeled as shown in the following formula:
where the path term indicates that an undirected path exists between node $v$ and node $v_k$; this constraint can be removed when disconnected subgraphs are also to be enumerated.
A custom instruction is convex if and only if every path between any two nodes $u$, $v$ in subgraph $S$ passes only through nodes in $S$. This constraint is modeled as shown in the following formula:
where $u_{sel}$ and $v_{sel}$ indicate whether nodes $u$ and $v$ are selected: 0 means not selected and 1 means selected.
The input/output constraint of a custom instruction is as shown in the following formula:
where $IN_{max}$ and $OUT_{max}$ are the upper limits on the number of inputs and outputs of a custom instruction; $IN_v$ and $OUT_v$ are the in-degree and out-degree of node $v$; $Pred(u) = \{v \mid v \in V, (v, u) \in E\}$ and $Succ(u) = \{v \mid v \in V, (u, v) \in E\}$ are the predecessor set and the successor set of node $u$; $v_{in}$ and $v_{out}$ are the numbers of inputs and outputs of node $v$; and $m_{sel}$ indicates whether node $m$ is selected.
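As a hedged sketch, the connectivity, convexity and input/output constraints described above are often written over the binary selection variables in a form like the one below. This is an assumption based on the verbal definitions in the text, not necessarily the exact formulas of the patent; the reachability symbol $\leadsto$ (a directed path in $G$) and the symbol $\leftrightsquigarrow_S$ (an undirected path through selected nodes only) are notation introduced here.

```latex
\begin{align}
& u_{sel} = 1 \wedge v_{sel} = 1 \;\Rightarrow\; u \leftrightsquigarrow_S v
  && \forall u, v \in V \quad \text{(connectivity)} \\
& u_{sel} = 1 \wedge v_{sel} = 1 \wedge (u \leadsto w \leadsto v) \;\Rightarrow\; w_{sel} = 1
  && \forall u, v, w \in V \quad \text{(convexity)} \\
& \bigl|\{\, v \in V : v_{sel} = 0,\ \exists u \in Succ(v),\ u_{sel} = 1 \,\}\bigr| \le IN_{max}
  && \text{(inputs)} \\
& \bigl|\{\, u \in V : u_{sel} = 1,\ \exists v \in Succ(u),\ v_{sel} = 0 \,\}\bigr| \le OUT_{max}
  && \text{(outputs)}
\end{align}
```

In this reading, an input is an unselected node that feeds a selected node, and an output is a selected node whose value is consumed by an unselected node.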
Custom instruction selection achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection, specifically as follows. Starting from the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first performed on all subgraphs: given two subgraphs $a$ and $b$, if $a$ and $b$ are isomorphic, a pattern $C_i$ is created and subgraphs $a$ and $b$ are recorded in pattern $C_i$ as its instances. A pattern is the graph representation of a candidate custom instruction.
To establish the constraint programming model of the custom instruction selection problem, some variables are defined first. $N$ is the number of candidate custom instructions produced by the custom instruction enumeration stage, and $C_i$ denotes the $i$-th candidate custom instruction, $i = 1, \dots, N$. Custom instruction $C_i$ has $n_i$ instances in the code, namely $c_{i,1}, \dots, c_{i,n_i}$, and the execution frequency of instance $c_{i,j}$ is $f_{i,j}$. The processor performance improvement brought by custom instruction $C_i$ and the hardware area required to implement it in the custom functional unit are denoted $P_i$ and $A_i$, respectively.
The maximization objective function for the processor performance improvement brought by custom instructions is then as shown in the following formula:
where $s_{i,j}$ is a binary variable whose value is 1 when custom instruction instance $c_{i,j}$ is selected and 0 otherwise.
By encapsulating multiple basic instructions, a custom instruction reduces the number of instruction fetches and the number of data transfers between registers and the processor, and thereby reduces the energy consumption of the processor. The maximization objective function for the processor energy reduction brought by custom instructions is as shown in the following formula:
where $E(c_{i,j})$ is the number of internal edges of custom instruction instance $c_{i,j}$; the two component expressions of the formula represent the reduction in the number of instruction fetches and the reduction in the number of data transfers, respectively; and $\alpha$, $\beta$ are weight parameters with $\alpha + \beta = 1$.
On the basis of the custom instruction selection model established from the objective functions above, and to simplify the problem, a weight-based method converts the multi-objective optimization problem into a single-objective optimization problem, which gives the custom instruction selection model shown in the following formula:
where $\gamma$ and $\varepsilon$ are weight parameters with $\gamma + \varepsilon = 1$.
For the area constraint given by the user: because the hardware that corresponds to each custom instruction in the custom functional unit has a certain size, the area constraint of custom instructions must be modeled, as shown in the following formula:
where $A$ is the total area budget given at extensible processor design time for the hardware of all custom instructions, $A_i$ is the hardware area that corresponds to the $i$-th custom instruction, and $S_i$ is a binary variable: if at least one instance of custom instruction $C_i$ is selected, $S_i$ is 1, otherwise it is 0, as shown in the following formula:
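The weighted single-objective model and the area constraint described above can be summarized, as a sketch, in the form below. The names $F_{perf}$ and $F_{energy}$ are placeholders introduced here for the two maximization objectives (performance improvement and energy reduction); their internal form is not reconstructed, and the linking condition for $S_i$ is one possible way to state "at least one instance is selected".

```latex
\begin{align}
\max\;& \gamma\, F_{perf}(s) + \varepsilon\, F_{energy}(s),
  \qquad \gamma + \varepsilon = 1 \\
\text{s.t.}\;& \sum_{i=1}^{N} A_i\, S_i \le A \\
& S_i = 1 \iff \sum_{j=1}^{n_i} s_{i,j} \ge 1, \qquad i = 1, \dots, N \\
& s_{i,j} \in \{0, 1\}, \qquad S_i \in \{0, 1\}
\end{align}
```

Because $S_i$ depends only on whether any instance of $C_i$ is chosen, the area of a pattern is paid once no matter how many of its instances are selected.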
The beneficial effects of the above technical solution are as follows. For the custom instruction enumeration problem, the automatic custom instruction identification method based on constraint programming provided by the present invention separates the modeling of the problem from its solution, so it applies to many combinations of constraints and has good generality and flexibility. For the custom instruction selection problem, establishing a multi-objective optimization constraint programming model makes multi-objective optimization possible. Applying the custom instructions identified automatically by the present invention to image processing algorithms can significantly improve the performance of those algorithms.
Brief description of the drawings
Fig. 1 is the flowchart, given in the background art, of automatically identifying an extended instruction set for image processing algorithms;
Fig. 2 is a schematic diagram of the data flow graph provided by an embodiment of the present invention;
Fig. 3 compares running times under different I/O constraints, as provided by an embodiment of the present invention;
Fig. 4 compares the running time of enumerating connected subgraphs with that of enumerating all subgraphs, as provided by an embodiment of the present invention;
Fig. 5 compares performance improvement, as provided by an embodiment of the present invention;
Fig. 6 compares the number of instructions selected by different methods, as provided by an embodiment of the present invention.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the present invention and are not intended to limit its scope.
An automatic custom instruction identification method based on constraint programming comprises two parts: custom instruction enumeration and custom instruction selection.
Custom instruction enumeration is realized by establishing a constraint programming model for custom instruction enumeration and enumerating from the data flow graph all subgraphs that satisfy the constraints, specifically as follows.
To enumerate from the data flow graph $G = (V, E)$ all custom instructions that satisfy the given constraints, let the subgraph $S = (V_s, E_s)$, with $V_s \subseteq V$ and $E_s \subseteq E$, be the graph representation of a custom instruction instance, and let $I_1$ and $I_2$ denote the set of valid nodes and the set of illegal nodes of $G$, respectively.
The data flow graph $G = (V, E)$ is a directed acyclic graph (DAG), as shown in Fig. 2. The node set $V = \{v_1, v_2, \dots, v_M\}$ represents basic instructions, where $M$ is the number of nodes of the data flow graph; the edge set $E$ represents the data dependences between instructions, and $m$ is the number of edges of the data flow graph.
The given constraints comprise: the constraint that a custom instruction contains no illegal node, the connectivity constraint of a custom instruction, the convexity constraint of a custom instruction, and the input/output constraint of a custom instruction.
Each constraint is modeled separately and, for the enumeration problem, all custom instructions that satisfy the constraints are found with a constraint programming method, which completes custom instruction enumeration.
The constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
$$\forall v \in I_2:\; v_{sel} = 0$$
where $v_{sel} = 0$ indicates that illegal node $v$ is not included in the custom instruction.
An illegal node is defined as follows: owing to limitations of the extensible processor architecture, two kinds of basic instructions, memory operations and branch operations, cannot be included in a custom instruction, and the nodes that represent these basic instructions are regarded as illegal nodes.
The connectivity constraint of a custom instruction is modeled as shown in the following formula:
where the path term indicates that an undirected path exists between node $v$ and node $v_k$; this constraint can be removed when disconnected subgraphs are also to be enumerated.
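The connectivity requirement can be checked on a candidate node set with a breadth-first search that ignores edge direction. The sketch below assumes that the undirected path must stay inside the selected subgraph; `is_connected` and the small graph `EDGES` are hypothetical and do not reproduce the graph of Fig. 2.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, int]


def is_connected(selected: Set[int], edges: Iterable[Edge]) -> bool:
    """True if the selected nodes form a single component when edge
    directions are ignored and only edges inside the selection are used."""
    if not selected:
        return False
    neighbours: Dict[int, Set[int]] = {v: set() for v in selected}
    for u, v in edges:
        if u in selected and v in selected:
            neighbours[u].add(v)
            neighbours[v].add(u)
    start = next(iter(selected))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == selected


# Hypothetical DAG: 1->2, 1->3, 2->4, 3->5, 4->5
EDGES = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
```

On this graph, `{1, 2, 3}` is connected, while `{2, 3}` is not, because nodes 2 and 3 are linked only through node 1, which lies outside the selection.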
A custom instruction is convex if and only if every path between any two nodes $u$, $v$ in subgraph $S$ passes only through nodes in $S$. This constraint is modeled as shown in the following formula:
where $u_{sel}$ and $v_{sel}$ indicate whether nodes $u$ and $v$ are selected: 0 means not selected and 1 means selected.
In this embodiment, for the data flow graph shown in Fig. 2, subgraph {1, 2, 3} is convex, whereas subgraph {2, 3, 5} is not convex.
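The convexity definition can be tested directly: a node set is non-convex exactly when some directed path leaves the set and later re-enters it. The sketch below uses a hypothetical five-node DAG chosen so that {1, 2, 3} is convex and {2, 3, 5} is not; it is not claimed to be the graph of Fig. 2, and `is_convex` is an illustrative name.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def is_convex(selected: Set[int], edges: Iterable[Edge]) -> bool:
    """True if no directed path leaves the selection and re-enters it."""
    succ: Dict[int, List[int]] = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    # Start from every node reached by an edge that leaves the selection.
    stack = [w for u in selected for w in succ.get(u, []) if w not in selected]
    seen: Set[int] = set()
    while stack:
        node = stack.pop()
        if node in selected:
            return False  # a path left the selection and came back
        if node in seen:
            continue
        seen.add(node)
        stack.extend(succ.get(node, []))
    return True


# Hypothetical DAG: 1->2, 1->3, 2->4, 4->5, 3->5
EDGES = [(1, 2), (1, 3), (2, 4), (4, 5), (3, 5)]
convex_123 = is_convex({1, 2, 3}, EDGES)
convex_235 = is_convex({2, 3, 5}, EDGES)
```

In this graph the path 2 → 4 → 5 passes through node 4, so any selection that contains nodes 2 and 5 but not node 4 violates convexity.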
The input/output constraint of a custom instruction is as shown in the following formula:
where $IN_{max}$ and $OUT_{max}$ are the upper limits on the number of inputs and outputs of a custom instruction; $IN_v$ and $OUT_v$ are the in-degree and out-degree of node $v$; $Pred(u) = \{v \mid v \in V, (v, u) \in E\}$ and $Succ(u) = \{v \mid v \in V, (u, v) \in E\}$ are the predecessor set and the successor set of node $u$; $v_{in}$ and $v_{out}$ are the numbers of inputs and outputs of node $v$; and $m_{sel}$ indicates whether node $m$ is selected.
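A minimal sketch of the input/output check follows. It assumes one common counting convention, which may differ from the patent's formula: an input is a distinct outside node that feeds the subgraph, and an output is a subgraph node with at least one consumer outside the subgraph. The names `io_counts`, `satisfies_io` and `EDGES` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def io_counts(selected: Set[int], edges: Iterable[Edge]) -> Tuple[int, int]:
    """Return (number of inputs, number of outputs) of the selected subgraph."""
    edges = list(edges)
    # Outside producers whose value is read by a selected node.
    inputs = {u for u, v in edges if v in selected and u not in selected}
    # Selected nodes whose value is read by an outside node.
    outputs = {u for u, v in edges if u in selected and v not in selected}
    return len(inputs), len(outputs)


def satisfies_io(selected: Set[int], edges: Iterable[Edge],
                 in_max: int, out_max: int) -> bool:
    """Check the IN_max / OUT_max limits for one candidate subgraph."""
    n_in, n_out = io_counts(selected, edges)
    return n_in <= in_max and n_out <= out_max


# Hypothetical DAG: 1->2, 1->3, 2->4, 3->5, 4->5
EDGES = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
```

This simplified count ignores values that are live at the end of the basic block; a sink node such as 5 contributes no output here.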
Custom instruction selection achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection, specifically as follows. Starting from the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first performed on all subgraphs: given two subgraphs $a$ and $b$, if $a$ and $b$ are isomorphic, a pattern $C_i$ is created and subgraphs $a$ and $b$ are recorded in pattern $C_i$ as its instances. A pattern is the graph representation of a candidate custom instruction.
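The grouping of isomorphic subgraphs into patterns can be sketched with a brute-force matcher, which is adequate only for the small subgraphs typical of custom instructions. The representation (an opcode label per node plus a set of directed edges) and the names `isomorphic` and `group_into_patterns` are assumptions for illustration; unlike the description above, this sketch also creates a pattern for a subgraph that has no isomorphic partner.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Tuple

# A subgraph is (opcode per node, directed edges between those nodes).
Subgraph = Tuple[Dict[int, str], FrozenSet[Tuple[int, int]]]


def isomorphic(a: Subgraph, b: Subgraph) -> bool:
    """Label-preserving isomorphism test by trying every node mapping."""
    labels_a, edges_a = a
    labels_b, edges_b = b
    if len(edges_a) != len(edges_b):
        return False
    if sorted(labels_a.values()) != sorted(labels_b.values()):
        return False
    nodes_a = sorted(labels_a)
    for perm in permutations(sorted(labels_b)):
        mapping = dict(zip(nodes_a, perm))
        if any(labels_a[n] != labels_b[mapping[n]] for n in nodes_a):
            continue
        if {(mapping[u], mapping[v]) for u, v in edges_a} == set(edges_b):
            return True
    return False


def group_into_patterns(subgraphs: List[Subgraph]) -> List[List[Subgraph]]:
    """Each inner list holds the instances of one pattern C_i."""
    patterns: List[List[Subgraph]] = []
    for sg in subgraphs:
        for instances in patterns:
            if isomorphic(sg, instances[0]):
                instances.append(sg)
                break
        else:
            patterns.append([sg])
    return patterns


# Hypothetical subgraphs: a and b are ADD->SHL, c is SHL->ADD.
a = ({1: "ADD", 2: "SHL"}, frozenset({(1, 2)}))
b = ({7: "ADD", 9: "SHL"}, frozenset({(7, 9)}))
c = ({3: "SHL", 4: "ADD"}, frozenset({(3, 4)}))
patterns = group_into_patterns([a, b, c])
```

Here `a` and `b` end up as two instances of one pattern, and `c` forms a second pattern because its edge runs from the shift to the addition.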
To establish the constraint programming model of the custom instruction selection problem, some variables are defined first. $N$ is the number of candidate custom instructions produced by the custom instruction enumeration stage, and $C_i$ denotes the $i$-th candidate custom instruction, $i = 1, \dots, N$. Custom instruction $C_i$ has $n_i$ instances in the code, namely $c_{i,1}, \dots, c_{i,n_i}$, and the execution frequency of instance $c_{i,j}$ is $f_{i,j}$. The processor performance improvement brought by custom instruction $C_i$ and the hardware area required to implement it in the custom functional unit are denoted $P_i$ and $A_i$, respectively.
The maximization objective function for the processor performance improvement brought by custom instructions is then as shown in the following formula:
where $s_{i,j}$ is a binary variable whose value is 1 when custom instruction instance $c_{i,j}$ is selected and 0 otherwise.
By encapsulating multiple basic instructions, a custom instruction reduces the number of instruction fetches and the number of data transfers between registers and the processor, and thereby reduces the energy consumption of the processor. The maximization objective function for the processor energy reduction brought by custom instructions is as shown in the following formula:
where $E(c_{i,j})$ is the number of internal edges of custom instruction instance $c_{i,j}$; the two component expressions of the formula represent the reduction in the number of instruction fetches and the reduction in the number of data transfers, respectively; and $\alpha$, $\beta$ are weight parameters with $\alpha + \beta = 1$.
On the basis of the custom instruction selection model established from the objective functions above, and to simplify the problem, a weight-based method converts the multi-objective optimization problem into a single-objective optimization problem, which gives the custom instruction selection model shown in the following formula:
where $\gamma$ and $\varepsilon$ are weight parameters with $\gamma + \varepsilon = 1$.
For the area constraint given by the user: because the hardware that corresponds to each custom instruction in the custom functional unit has a certain size, the area constraint of custom instructions must be modeled, as shown in the following formula:
where $A$ is the total area budget given at extensible processor design time for the hardware of all custom instructions, $A_i$ is the hardware area that corresponds to the $i$-th custom instruction, and $S_i$ is a binary variable: if at least one instance of custom instruction $C_i$ is selected, $S_i$ is 1, otherwise it is 0, as shown in the following formula:
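The weighted selection model with its area constraint can be illustrated by exhaustive search over the instance variables $s_{i,j}$. This is a sketch under stated assumptions, not the constraint programming model of the patent: the per-instance performance and energy gains are supplied as plain numbers (hypothetical inputs `perf` and `energy`), only the area constraint is modeled, and the search is exponential, so it suits toy inputs only.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Choice = List[Tuple[int, ...]]


def select(perf: Sequence[Sequence[float]],
           energy: Sequence[Sequence[float]],
           area: Sequence[float],
           budget: float,
           gamma: float = 0.5,
           eps: float = 0.5) -> Tuple[float, Optional[Choice]]:
    """perf[i][j] and energy[i][j] are the gains of instance j of pattern i;
    area[i] is A_i; budget is the total area A."""
    shape = [len(row) for row in perf]
    best_value, best_choice = float("-inf"), None
    for flat in product((0, 1), repeat=sum(shape)):
        # Rebuild s[i][j] from the flat bit vector.
        s, pos = [], 0
        for n in shape:
            s.append(flat[pos:pos + n])
            pos += n
        # S_i = 1 iff at least one instance of pattern i is selected.
        used_area = sum(a for a, row in zip(area, s) if any(row))
        if used_area > budget:
            continue
        value = sum(sij * (gamma * perf[i][j] + eps * energy[i][j])
                    for i, row in enumerate(s) for j, sij in enumerate(row))
        if value > best_value:
            best_value, best_choice = value, s
    return best_value, best_choice


# Hypothetical data: two patterns, the first with two instances.
value, choice = select(perf=[[4.0, 2.0], [5.0]],
                       energy=[[2.0, 2.0], [1.0]],
                       area=[300.0, 400.0],
                       budget=500.0)
```

With these numbers the two patterns together exceed the budget, so the search keeps both instances of the first pattern, whose area is counted once.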
In this embodiment, the experiments run on an i3-3240 3.4 GHz processor with 4 GB of main memory under the Windows 8 operating system. The constraint programming tool is JaCoP 2.3. The benchmarks come from MediaBench and MiBench. The benchmark applications used in this embodiment are common algorithms in image processing or video processing.
In this embodiment, for common image processing algorithms, the GeCoS front-end compiler is first used to convert the algorithm program into the corresponding control data flow graph. Then the constraint-programming-based custom instruction enumeration method of the present invention enumerates from the data flow graph all subgraphs that satisfy the constraints. The enumeration results are shown in Table 1, where the columns Nodes, Enumerated Subgraphs and Time give, respectively, the number of nodes of the data flow graph of each benchmark, the number of enumerated connected subgraphs that satisfy the constraints (with the input and output upper limits set to 6 and 2, respectively), and the running time of the enumeration method.
Table 1. Custom instruction enumeration results
To further analyze how different constraints affect the running time of the enumeration method, this embodiment compares the running times of the enumeration method under different input/output constraints. For the benchmarks SUSAN, JPEG Encode, JPEG Decode and MESA, the running times under different I/O constraints are compared in Fig. 3.
Fig. 3 shows that the running time of the enumeration method increases markedly as the numbers of inputs and outputs increase. Further analysis shows that increasing the number of outputs affects the running time far more than increasing the number of inputs. For example, compared with an input/output upper limit of 6/2, the running time of the enumeration method increases by 1.5 times on average when the limit is 7/2, and by 10 times on average when the limit is 6/3.
The connectivity of the enumerated subgraphs is an important constraint in custom instruction enumeration. In this embodiment, the running time of enumerating only connected subgraphs is compared with the running time of enumerating all subgraphs (both connected and disconnected subgraphs); the results are shown in Fig. 4 (I/O constraint 6/2). The figure shows that enumerating all subgraphs takes far longer than enumerating only connected subgraphs.
In this embodiment, the constraint-programming-based custom instruction selection method of the present invention is compared with the custom instruction selection method proposed by Kamal et al. and the one proposed by Xiao et al. The method of Kamal et al. selects, under a given area constraint, the custom instructions that maximize performance improvement. The method of Xiao et al. reduces power consumption under a given area constraint by selecting fewer custom instructions.
In this embodiment, following the method of Kamal et al., the hardware delay and area of the basic instructions implemented in the custom functional unit, the hardware that implements custom instructions, are shown in Table 2.
Table 2. Hardware delay and area of basic instructions in the custom functional unit
Operation | Area | Delay (ns) |
SUB | 225 | 0.5 |
ADD | 200 | 0.5 |
SHR/SHL | 326 | 0.19 |
EQT/NEQ | 87 | 0.16 |
GRT/LKS | 115 | 0.21 |
AND | 41 | 0.04 |
OR | 42 | 0.05 |
XOR | 64 | 0.05 |
In this embodiment, it is assumed that custom instructions, each containing multiple nodes, execute on the custom functional unit, while the basic instructions of the application that are not included in any custom instruction execute on the reference processor. Formula (13) gives the overall latency of the application that uses custom instructions:
$$L_h = \Bigl(\sum_{S \in SC} \sum_{i \in C(S)} HW(i) + \sum_{S \in SC} T(S)\Bigr) + \sum_{K \in P} SW(K) \tag{13}$$
where $HW(i)$ is the hardware delay of node $i$ of a custom instruction and $T(S)$ is the extra latency needed to transfer the input and output operands of custom instruction $S$. In formula (13), $\sum_{S \in SC} \sum_{i \in C(S)} HW(i)$ is the accumulated hardware latency of the selected custom instructions ($SC$ is the set of selected custom instructions and $C(S)$ is the set of nodes on the critical path of selected custom instruction $S$); the last term is the accumulated software latency of the basic instructions that are not included in any custom instruction, where $P$ is the set of those basic instructions.
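Formula (13) can be evaluated with a few lines of code. In the sketch below the hardware delays are taken from Table 2, while the operand transfer latencies, the software latencies and the example instructions are hypothetical values chosen for illustration; all latencies are assumed to be expressed in the same time unit.

```python
from typing import Dict, List, Sequence, Tuple

# Hardware delays in ns for a few operations, taken from Table 2.
HW_DELAY_NS: Dict[str, float] = {
    "ADD": 0.5, "SUB": 0.5, "SHL": 0.19, "SHR": 0.19,
    "AND": 0.04, "OR": 0.05, "XOR": 0.05,
}

# One selected custom instruction: (operations on its critical path, T(S)).
CustomInstr = Tuple[List[str], float]


def overall_latency(selected: Sequence[CustomInstr],
                    uncovered_sw: Sequence[float],
                    hw_delay: Dict[str, float] = HW_DELAY_NS) -> float:
    """Evaluate L_h of formula (13)."""
    # Accumulated hardware delay along each critical path C(S).
    hw = sum(hw_delay[op] for path, _ in selected for op in path)
    # Extra latency T(S) for transferring input and output operands.
    transfer = sum(t for _, t in selected)
    # Software latency SW(K) of basic instructions left on the processor.
    sw = sum(uncovered_sw)
    return (hw + transfer) + sw


# Hypothetical example: two custom instructions, three uncovered instructions.
latency = overall_latency(
    selected=[(["ADD", "SHL"], 1.0), (["AND", "XOR"], 1.0)],
    uncovered_sw=[2.0, 2.0, 2.0],
)
```

Only the critical path of each custom instruction contributes hardware delay, which is why nodes off the critical path do not appear in the first term.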
The performance improvement achieved by using custom instructions is computed as shown in formula (14):
where $\sum_{i=1}^{n} SW(i)$ is the accumulated software latency of all basic instructions in the source code of the original application ($n$ is the number of basic instructions in the source code).
In this embodiment, the custom instruction selection method of the present invention is compared with the custom instruction selection methods proposed by Kamal et al. and Xiao et al. The area constraint is set to 10%, 30% and 50% of a reference area, where the reference area is the total area of the custom instructions that the greedy algorithm of Bonzini et al. selects to maximize performance improvement. For the 9 benchmarks listed in Table 1, Table 3 compares the number of selected instructions (NS) and the performance improvement (PI) of the three methods.
Table 3. Comparison of experimental results of the custom instruction selection methods
In this embodiment, the parameters $\gamma$, $\varepsilon$, $\alpha$ and $\beta$ of the multi-objective optimization model of the present invention are all set to 0.5. It can be observed that, as the area constraint is relaxed, the performance improvement obtained by all three methods increases. Compared with the method of Xiao et al., the method of the present invention performs better in performance improvement: it achieves an average performance improvement of 3.12 times, against 2.81 times for the method of Xiao et al. On the other hand, the method of the present invention selects fewer custom instruction instances than the method of Kamal et al.: 58 instructions on average, against 62 for the method of Kamal et al. Reducing the number of custom instruction instances reduces the number of instruction fetches and the number of data transfers between registers and the processor, and thus reduces energy consumption.
In addition, by adjusting the parameters $\gamma$ and $\varepsilon$ of the multi-objective optimization model, the method of the present invention can favor either performance improvement or a smaller number of instructions. When $\gamma$ and $\varepsilon$ are set to 1 and 0, the method of the present invention achieves a better performance improvement than the method of Kamal et al., as shown in Fig. 5 (area constraint 50%). With $\gamma = 1$ and $\varepsilon = 0$, the problem model becomes: under a given area constraint, find the custom instruction selection that maximizes performance improvement. The constraint programming method used by the present invention can find an optimal solution, whereas the method of Kamal et al. cannot guarantee that its solution is optimal. The method of the present invention therefore achieves a more pronounced performance improvement.
When γ and ε are set to 0 and 1 respectively, the method of the present invention selects fewer instruction instances than the method proposed by Xiao et al.; the results are shown in Figure 6. With γ = 0 and ε = 1, the problem model reduces to selecting, under a given area constraint, the minimum number of instruction instances that cover the original data flow graph. For every benchmark program, the constraint programming method can select the minimum number of instructions, whereas the heuristic method proposed by Xiao et al. fails to find the minimum number in most cases.
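The γ = 0, ε = 1 case can be pictured as a covering problem. The sketch below is an illustration under stated assumptions, not the patented model: the node set and the instances are made up, selected instances are assumed to be pairwise disjoint, every node left uncovered is assumed to execute as one basic instruction, the area constraint is omitted for brevity, and a largest-first greedy rule stands in for a generic heuristic.

```python
from itertools import combinations

# Hypothetical data flow graph nodes and candidate instances (sets of nodes).
NODES = {1, 2, 3, 4, 5, 6}
INSTANCES = {"a": {2, 3, 4, 5}, "b": {1, 2, 3}, "c": {4, 5, 6}}


def instruction_count(selection, instances, nodes):
    """Selected instances plus one basic instruction per uncovered node."""
    covered = set().union(*(instances[n] for n in selection))
    return len(selection) + len(nodes - covered)


def is_disjoint(selection, instances):
    """True when no two selected instances share a node."""
    return all(instances[x].isdisjoint(instances[y])
               for x, y in combinations(selection, 2))


def minimum_cover(instances, nodes):
    """Exhaustively search for the disjoint selection with the fewest instructions."""
    names = sorted(instances)
    best, best_count = (), len(nodes)
    for r in range(1, len(names) + 1):
        for selection in combinations(names, r):
            if not is_disjoint(selection, instances):
                continue
            count = instruction_count(selection, instances, nodes)
            if count < best_count:
                best, best_count = selection, count
    return list(best), best_count


def greedy_cover(instances, nodes):
    """Largest instance first, skipping any instance that overlaps a chosen one."""
    chosen, covered = [], set()
    for name in sorted(instances, key=lambda n: len(instances[n]), reverse=True):
        if instances[name].isdisjoint(covered):
            chosen.append(name)
            covered |= instances[name]
    return chosen, instruction_count(chosen, instances, nodes)
```

On this assumed graph the greedy rule commits to the four-node instance `a` and ends with three instructions, whereas the exhaustive search covers all six nodes with `b` and `c`, that is, two instructions.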
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.
Claims (3)
1. An automatic custom instruction identification method based on constraint programming, characterized in that it comprises two parts: enumeration of custom instructions and selection of custom instructions;
the enumeration of custom instructions is realized by establishing a constraint programming model for custom instruction enumeration and enumerating from the data flow graph all subgraphs that satisfy the constraint conditions, specifically as follows:
in order to enumerate from the data flow graph G(V, E) all custom instructions that satisfy the given constraints, let the subgraph S = (Vs, Es) be the graph representation of a custom instruction instance, and let I1 and I2 denote respectively the set of valid nodes and the set of illegal nodes in graph G;
the data flow graph G = (V, E) is a directed acyclic graph, in which the node set V = {v1, v2, ..., vM} represents the basic instructions, M is the number of nodes of the data flow graph, the edge set E represents the data dependencies between instructions, and m is the number of edges of the data flow graph;
the given constraint conditions comprise: the constraint that a custom instruction contains no illegal node, the connectivity constraint of a custom instruction, the constraint that a custom instruction is convex, and the input/output constraint of a custom instruction;
each constraint condition is modeled separately, and for the enumeration problem a constraint programming method is used to find all custom instructions that satisfy the constraint conditions, thereby completing the enumeration of custom instructions;
the selection of custom instructions realizes multi-objective optimization by establishing a constraint programming model for custom instruction selection, specifically as follows:
on the basis of the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first performed on all subgraphs;
in order to establish the constraint programming model of the custom instruction selection problem, some variables are first defined: N is the number of candidate custom instructions enumerated in the custom instruction enumeration stage, and Ci denotes the i-th candidate custom instruction, i = 1, ..., N; custom instruction Ci has ni instances in the code, denoted ci,j (j = 1, ..., ni), and the execution frequency of each custom instruction instance is fi,j; the processor performance improvement brought by a custom instruction and the hardware area required to implement the custom instruction in the custom functional unit are denoted Pi and Ai respectively;
the objective function that maximizes the processor performance improvement brought by the custom instructions is then given by the following formula:
where si,j is a binary variable whose value is 1 when custom instruction instance ci,j is selected and 0 otherwise;
since a custom instruction encapsulates multiple basic instructions, it reduces the number of instruction fetches and the number of data transfers between the registers and the processor, and thereby reduces the energy consumption of the processor; the objective function that maximizes the reduction in processor energy consumption brought by the custom instructions is then given by the following formula:
where E(ci,j) denotes the number of internal edges of custom instruction instance ci,j, the terms of the formula denote respectively the reduction in the number of instruction fetches and the reduction in the number of data transfers, and α and β are weight parameters with α + β = 1;
on the basis of the objective-function-based custom instruction selection model established above, and in order to simplify the problem, a weight-based method is used to convert the multi-objective optimization problem into a single-objective optimization problem, giving the custom instruction selection model shown in the following formula:
where γ and ε are weight parameters with γ + ε = 1;
for the area constraint given by the user, the hardware corresponding to each custom instruction in the custom functional unit has a certain area, so the area constraint of the custom instructions needs to be modeled, as shown in the following formula:
where A is the total area budget given for the hardware corresponding to all custom instructions when the extensible processor is designed, Ai is the hardware area corresponding to the i-th custom instruction, and Si is a binary variable; if at least one instance of custom instruction Ci is selected, the value of Si is 1, and otherwise it is 0, as shown in the following formula:
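As a reading aid only, one plausible form of the selection model that is consistent with the variable definitions of claim 1 is sketched below. It is an assumption, not the wording of the claim: the names F_perf, F_energy and R_fetch are introduced here for illustration, the weighting of the data-transfer term by fi,j is assumed, and the fetch-reduction term R_fetch is left abstract because the text does not state its form.

```latex
\begin{align}
F_{\text{perf}} &= \sum_{i=1}^{N} \sum_{j=1}^{n_i} s_{i,j}\, f_{i,j}\, P_i \\
F_{\text{energy}} &= \alpha\, R_{\text{fetch}}
  + \beta \sum_{i=1}^{N} \sum_{j=1}^{n_i} s_{i,j}\, f_{i,j}\, E(c_{i,j}),
  \qquad \alpha + \beta = 1 \\
\max\; & \gamma\, F_{\text{perf}} + \varepsilon\, F_{\text{energy}},
  \qquad \gamma + \varepsilon = 1 \\
\text{s.t.}\; & \sum_{i=1}^{N} A_i\, S_i \le A, \qquad
S_i = \begin{cases} 1, & \sum_{j=1}^{n_i} s_{i,j} \ge 1 \\ 0, & \text{otherwise} \end{cases}
\end{align}
```

Under this reading, setting γ = 1 and ε = 0 leaves only the performance term, and setting γ = 0 and ε = 1 leaves only the energy term.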
2. The automatic custom instruction identification method based on constraint programming according to claim 1, characterized in that the constraint conditions are modeled specifically as follows:
the constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
vsel = 0
where vsel = 0 indicates that the illegal node v is not included in the custom instruction;
the illegal nodes are defined as follows: owing to the limitations of the extensible processor architecture, two kinds of basic instruction, memory operations and branch operations, cannot be included in a custom instruction, and the nodes representing these basic instructions are regarded as illegal nodes;
the connectivity constraint of a custom instruction is modeled as shown in the following formula:
where the formula expresses that an undirected path exists between node v and node vk; this constraint can be removed when disconnected subgraphs are to be enumerated;
a custom instruction is convex if and only if every path between any two nodes u, v in the subgraph S passes only through nodes in S; this constraint is modeled as shown in the following formula:
where usel and vsel indicate whether nodes u and v respectively are selected, 0 meaning not selected and 1 meaning selected;
the input/output constraint of a custom instruction is given by the following formula:
where INmax and OUTmax denote respectively the upper limits on the number of inputs and outputs of a custom instruction, INv and OUTv denote respectively the in-degree and out-degree of node v, Pred(u) = {v | v ∈ V, (v, u) ∈ E} and Succ(u) = {v | v ∈ V, (u, v) ∈ E} denote respectively the predecessor node set and the successor node set of node u, vin and vout denote respectively the number of inputs and outputs of node v, and msel indicates whether node m is selected.
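The four constraints of claim 2 can be checked directly on a small graph. The sketch below is a brute-force illustration, not the constraint programming model of the claim: the graph, the operation names and the helper functions are hypothetical, inputs are counted as distinct producers outside the subgraph, and outputs as subgraph nodes whose value is consumed outside the subgraph or has no consumer at all.

```python
from itertools import combinations

# Hypothetical data flow graph: node -> operation, plus directed edges.
OPS = {1: "add", 2: "mul", 3: "load", 4: "sub", 5: "add"}
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
ILLEGAL_OPS = {"load", "store", "branch"}  # memory and branch operations


def successors(node):
    return {v for u, v in EDGES if u == node}


def predecessors(node):
    return {u for u, v in EDGES if v == node}


def has_no_illegal_node(sub):
    return all(OPS[v] not in ILLEGAL_OPS for v in sub)


def is_connected(sub):
    """Every node of sub is reachable from one start node via undirected edges in sub."""
    start = next(iter(sub))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in (successors(node) | predecessors(node)) & sub:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == sub


def is_convex(sub):
    """No directed path leaves sub and re-enters it later."""
    outside = {v for u in sub for v in successors(u)} - sub
    escaped, stack = set(outside), list(outside)
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in escaped:
                escaped.add(nxt)
                stack.append(nxt)
    return not (escaped & sub)


def io_counts(sub):
    """Simplified input and output counts (see the assumptions in the lead-in)."""
    inputs = {u for v in sub for u in predecessors(v)} - sub
    outputs = {u for u in sub if not successors(u) or successors(u) - sub}
    return len(inputs), len(outputs)


def enumerate_candidates(in_max, out_max):
    """Return every node subset that satisfies all four constraints."""
    found = []
    nodes = sorted(OPS)
    for r in range(1, len(nodes) + 1):
        for combo in combinations(nodes, r):
            sub = set(combo)
            if not (has_no_illegal_node(sub) and is_connected(sub) and is_convex(sub)):
                continue
            n_in, n_out = io_counts(sub)
            if n_in <= in_max and n_out <= out_max:
                found.append(combo)
    return found
```

In this assumed graph the subset {1, 2, 4} is rejected as non-convex, because the path 1 → 3 → 4 passes through the illegal load node 3, which lies outside the subset. Exhaustive enumeration is exponential in the number of nodes; it is used here only to make the constraints concrete.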
3. The automatic custom instruction identification method based on constraint programming according to claim 1, characterized in that the graph isomorphism matching performed on all subgraphs is specifically as follows:
given two subgraphs a and b, if a and b are isomorphic, a pattern Ci is created, and subgraphs a and b are recorded in pattern Ci as instances; the pattern is the graph representation of a candidate custom instruction.
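The grouping of isomorphic subgraphs into patterns can be sketched as follows. This is a minimal illustration under assumptions that the claim does not state: subgraphs are represented as a pair (node-to-operation map, directed edge list), two subgraphs are treated as isomorphic only if a node mapping preserves both the edges and the operations, a subgraph with no isomorphic partner still forms its own pattern, and the brute-force permutation test is only practical for very small subgraphs.

```python
from itertools import permutations


def is_isomorphic(a, b):
    """Brute-force isomorphism test on (ops, edges) pairs with operation labels."""
    ops_a, edges_a = a
    ops_b, edges_b = b
    if len(ops_a) != len(ops_b) or len(edges_a) != len(edges_b):
        return False
    nodes_a = sorted(ops_a)
    for perm in permutations(sorted(ops_b)):
        mapping = dict(zip(nodes_a, perm))
        # The mapping must preserve the operation of every node ...
        if any(ops_a[n] != ops_b[mapping[n]] for n in nodes_a):
            continue
        # ... and map the edge set of a exactly onto the edge set of b.
        if {(mapping[u], mapping[v]) for u, v in edges_a} == set(edges_b):
            return True
    return False


def group_into_patterns(subgraphs):
    """Each pattern is a list of mutually isomorphic subgraphs (its instances)."""
    patterns = []
    for sub in subgraphs:
        for pattern in patterns:
            if is_isomorphic(pattern[0], sub):
                pattern.append(sub)
                break
        else:
            patterns.append([sub])
    return patterns


# Hypothetical enumerated subgraphs: (node -> operation, directed edges).
SUBGRAPHS = [
    ({1: "add", 2: "mul"}, [(1, 2)]),
    ({7: "add", 9: "mul"}, [(7, 9)]),
    ({3: "mul", 4: "add"}, [(3, 4)]),
]
```

Calling `group_into_patterns(SUBGRAPHS)` places the first two subgraphs (an add feeding a mul) in one pattern with two instances and the third (a mul feeding an add) in a pattern of its own.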
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910627531.6A CN110333857B (en) | 2019-07-12 | 2019-07-12 | Automatic user-defined instruction identification method based on constraint programming |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110333857A (en) | 2019-10-15 |
CN110333857B (en) | 2023-03-14 |
Family
ID=68146500
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910627531.6A Active CN110333857B (en) | 2019-07-12 | 2019-07-12 | Automatic user-defined instruction identification method based on constraint programming |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110333857B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030014742A1 (en) * | 2001-07-09 | 2003-01-16 | Sasken Communication Technologies Limited | Technique for compiling computer code to reduce energy consumption while executing the code |
CN102929580A (en) * | 2012-11-06 | 2013-02-13 | 无锡江南计算技术研究所 | Partitioning method and device of digit group multi-reference access |
CN103995540A (en) * | 2014-05-22 | 2014-08-20 | 哈尔滨工业大学 | Method for rapidly generating finite time track of hypersonic aircraft |
CN105335129A (en) * | 2014-06-23 | 2016-02-17 | 联想(北京)有限公司 | Information processing method and electronic equipment |
US20180196673A1 (en) * | 2015-07-31 | 2018-07-12 | Arm Limited | Vector length querying instruction |
CN105138601A (en) * | 2015-08-06 | 2015-12-09 | 中国科学院软件研究所 | Graph pattern matching method for supporting fuzzy constraint relation |
CN107870780A (en) * | 2016-09-28 | 2018-04-03 | 华为技术有限公司 | Data processing equipment and method |
US20180300148A1 (en) * | 2017-04-12 | 2018-10-18 | Arm Limited | Apparatus and method for determining a recovery point from which to resume instruction execution following handling of an unexpected change in instruction flow |
Non-Patent Citations (3)
Title |
---|
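B. Chakraborty et al.: "Handling Constraints in Multi-Objective GA for Embedded System Design", 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design (VLSID'06) * |
Xiao Chenglong et al.: "Automatic custom instruction identification method for high-level synthesis", Journal of Computer Applications * |
Gong Aihui et al.: "CSPack: a novel packing algorithm using CSP graph matching", Journal of Computer-Aided Design & Computer Graphics * |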
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113296788A (en) * | 2021-06-10 | 2021-08-24 | 上海东软载波微电子有限公司 | Instruction scheduling method, apparatus, device, storage medium and program product |
CN113296788B (en) * | 2021-06-10 | 2024-04-12 | 上海东软载波微电子有限公司 | Instruction scheduling method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110333857B (en) | 2023-03-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||