CN110333857A - Method for automatically identifying custom instructions based on constraint programming - Google Patents


Info

Publication number
CN110333857A
Authority
CN
China
Prior art keywords
custom instruction
instruction
custom
node
constraint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910627531.6A
Other languages
Chinese (zh)
Other versions
CN110333857B (en)
Inventor
肖成龙
王珊珊
王心霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning Technical University
Original Assignee
Liaoning Technical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning Technical University
Priority to CN201910627531.6A
Publication of CN110333857A
Application granted
Publication of CN110333857B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/35 Creation or generation of source code model driven
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The present invention provides a method for automatically identifying custom instructions based on constraint programming, and relates to the technical field of electronic design automation (EDA). The method comprises two parts: enumeration of custom instructions and selection of custom instructions. Enumeration is carried out by establishing a constraint programming model for custom instruction enumeration and enumerating from the data-flow graph all subgraphs that satisfy the constraints: each constraint is modeled separately and, for the enumeration problem, constraint programming is used to find all custom instructions that satisfy the constraints. Selection achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection: objective functions are established that maximize the processor performance improvement and the energy reduction brought by the custom instructions, and a weight-based method converts the multi-objective optimization problem into a single-objective one.

Description

Method for automatically identifying custom instructions based on constraint programming
Technical field
The present invention relates to the technical field of electronic design automation (EDA), and in particular to a method for automatically identifying custom instructions based on constraint programming.
Background
In recent years, to meet the growing demand of embedded applications for high performance and low power consumption, extended instruction sets have been widely used in embedded systems. For example, an application-specific instruction processor (Application Specific Instruction Processor, ASIP) combines the advantages of general-purpose processors and ASICs and offers a good compromise in design cycle, flexibility, performance and power consumption. A custom instruction in an extended instruction set encapsulates a series of basic instructions, chaining and parallelizing them, and thereby improves performance.
Extended instruction sets for specific applications are the core of ASIP design. They are typically used in fields such as multimedia processing and signal processing. To let heterogeneous multiprocessors run different multimedia applications better, Dammak et al. applied extended instruction sets to a heterogeneous multi-core system-on-chip, giving the system a good trade-off between performance and power consumption. Momcilovic et al. used an ASIP to execute a data-adaptive motion estimation algorithm, which greatly reduced the computational cost and improved video processing speed. Sisto et al. proposed an application-specific processor design for sensor signal conditioning in automotive applications.
Image processing is developing rapidly, and its results keep improving. Although learning-based mechanisms such as neural networks and support vector machines perform well in image processing, image data volumes are now very large, and some otherwise effective optimization algorithms need a great deal of time to process image data or training samples. Real-time image processing additionally imposes strict time limits. Research in China and abroad has found that applying extended instruction sets to image processing can improve performance significantly. Mori et al. proposed an application-specific processor design for accelerating real-time IP/CV algorithms. Edwards et al. applied an extended instruction set to a real-time object detection system and increased processing speed by a factor of 1.5 to 6.8.
Early research implemented applications efficiently by designing dedicated chips, but dedicated chips have long design cycles, their hardware is difficult to develop and debug, and their cost is high. More and more researchers have therefore shifted their focus to extended instructions and to automatically identifying an extended instruction set for a specific application.
The flow of automatic identification of an extended instruction set is shown in Fig. 1. First, the source code of the image processing algorithm is given as input to the open-source compiler GeCoS, which converts it into a control data flow graph (Control Data Flow Graph, CDFG), a graph that represents the data dependences among multiple basic blocks. Next, a subgraph enumeration algorithm enumerates from the data-flow graph all subgraphs that satisfy the constraints (a subgraph is the graph representation of a custom instruction). A subgraph selection algorithm then selects some of the best enumerated subgraphs as the final custom instructions. Finally, the source code is transformed into new code that contains the selected custom instructions.
Constraint programming is a general search technique combined with logical inference; it originates from the constraint satisfaction problem (Constraint Satisfaction Problem, CSP) in computer science and artificial intelligence. A CSP consists of a set of variables, the domains of those variables, and a set of constraints (equations, inequalities, programs and so on can all serve as constraints); solving it means finding, among all combinations of values, one or more that satisfy the constraints. Combinatorial optimization and optimal scheduling problems are typically CSPs. When constraint programming is used, the problem statement stays close to the practical problem and the constraints need not be converted into linear equalities or inequalities, which keeps the formulation simple and easy to understand.
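To make the CSP definition concrete, here is a minimal sketch of a solver that enumerates every solution by plain backtracking. It is an illustration only: the function `solve_all`, the toy variables and the two constraints are assumptions introduced here, and a real constraint programming tool adds constraint propagation on top of such a search.

```python
from typing import Callable, Dict, List

Assignment = Dict[str, int]
# A constraint returns True for partial assignments it cannot judge yet.
Constraint = Callable[[Assignment], bool]


def solve_all(variables: List[str],
              domains: Dict[str, List[int]],
              constraints: List[Constraint]) -> List[Assignment]:
    """Enumerate every assignment that satisfies all constraints."""
    solutions: List[Assignment] = []

    def backtrack(index: int, partial: Assignment) -> None:
        if index == len(variables):
            solutions.append(dict(partial))
            return
        var = variables[index]
        for value in domains[var]:
            partial[var] = value
            # Prune as soon as any constraint is violated.
            if all(check(partial) for check in constraints):
                backtrack(index + 1, partial)
            del partial[var]

    backtrack(0, {})
    return solutions


# Hypothetical toy problem: x, y, z in {0, 1, 2} with x < y and y + z == 3.
def x_less_than_y(a: Assignment) -> bool:
    return "x" not in a or "y" not in a or a["x"] < a["y"]


def y_plus_z_is_3(a: Assignment) -> bool:
    return "y" not in a or "z" not in a or a["y"] + a["z"] == 3


toy_solutions = solve_all(["x", "y", "z"],
                          {name: [0, 1, 2] for name in "xyz"},
                          [x_less_than_y, y_plus_z_is_3])
```

The constraints are stated directly as predicates, which mirrors the point made above that constraint programming does not require a linear reformulation.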
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above shortcomings of the prior art, to provide a method that automatically identifies custom instructions by means of constraint programming.
To solve this technical problem, the present invention adopts the following technical solution: a method for automatically identifying custom instructions based on constraint programming, comprising two parts, enumeration of custom instructions and selection of custom instructions;
Enumeration of custom instructions is carried out by establishing a constraint programming model for custom instruction enumeration and enumerating from the data-flow graph all subgraphs that satisfy the constraints. Specifically:
To enumerate from the data-flow graph G = (V, E) all custom instructions that satisfy the given constraints, let the subgraph S = (Vs, Es) be the graph representation of a custom instruction instance, and let I1 and I2 denote the set of valid nodes and the set of illegal nodes in G, respectively;
The data-flow graph G = (V, E) is a directed acyclic graph (Directed Acyclic Graph, DAG). The node set V = {v1, v2, ..., vM} represents the basic instructions, where M is the number of nodes of the data-flow graph; the edge set E represents the data dependences between instructions, where m is the number of edges of the data-flow graph;
The given constraints are: the custom instruction contains no illegal node, the custom instruction is connected, the custom instruction is convex, and the custom instruction respects the input and output limits;
Each constraint is modeled separately and, for the enumeration problem, constraint programming is used to find all custom instructions that satisfy the constraints, which completes the enumeration of custom instructions;
The constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
Where v_sel = 0 indicates that the illegal node v is not included in the custom instruction;
Illegal nodes are defined as follows: owing to the limitations of the extensible processor architecture, two kinds of basic instructions, memory operations and branch operations, cannot be included in a custom instruction, and the nodes representing these basic instructions are regarded as illegal nodes;
The connectivity constraint of a custom instruction is modeled as shown in the following formula:
Where the formula expresses that an undirected path exists between node v and node v_k; this constraint can be removed when disconnected subgraphs are to be enumerated;
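The two constraints above can be checked on a candidate node set with a few lines of code. The sketch below is not the constraint model itself but a plain verifier, and it assumes that the undirected path must stay inside the selected subgraph; the graph `EDGES`, the set `ILLEGAL` and both function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def has_no_illegal_node(selected: Set[int], illegal: Set[int]) -> bool:
    """v_sel must be 0 for every illegal node v."""
    return selected.isdisjoint(illegal)


def is_connected(selected: Set[int], edges: List[Edge]) -> bool:
    """True if the selected nodes form one component when edge direction is ignored."""
    if not selected:
        return False
    neighbours: Dict[int, Set[int]] = {v: set() for v in selected}
    for u, v in edges:
        if u in selected and v in selected:  # assumption: only internal edges count
            neighbours[u].add(v)
            neighbours[v].add(u)
    start = next(iter(selected))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == selected


# Hypothetical data-flow graph; node 5 stands for a memory operation.
EDGES: List[Edge] = [(1, 3), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6)]
ILLEGAL: Set[int] = {5}
```

Dropping the `is_connected` check corresponds to the case mentioned above in which disconnected subgraphs are also enumerated.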
The convexity constraint: a custom instruction is convex if and only if every path between any two nodes u and v of the subgraph S passes only through nodes of S. This constraint is modeled as shown in the following formula:
Where u_sel and v_sel indicate whether nodes u and v are selected; 0 means not selected and 1 means selected;
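Convexity as defined above can be tested by checking that no directed path leaves the selected set and later re-enters it. The following is a minimal sketch under that reading; the function name and the diamond-shaped example graph are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def is_convex(selected: Set[int], edges: List[Edge]) -> bool:
    """True if no directed path goes selected -> outside -> ... -> selected."""
    succ: Dict[int, Set[int]] = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    # Start from the outside nodes that are fed directly by the selection.
    frontier = [w for u in selected for w in succ.get(u, ()) if w not in selected]
    seen_outside = set(frontier)
    while frontier:
        node = frontier.pop()
        for w in succ.get(node, ()):
            if w in selected:
                return False  # the path re-enters the subgraph
            if w not in seen_outside:
                seen_outside.add(w)
                frontier.append(w)
    return True


# Hypothetical diamond: 1 feeds 2 and 3, which both feed 4.
DIAMOND: List[Edge] = [(1, 2), (1, 3), (2, 4), (3, 4)]
```

On this graph `{1, 4}` is not convex because the paths from 1 to 4 pass through the unselected nodes 2 and 3, while `{1, 2}` and the whole graph are convex.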
The input and output constraint of a custom instruction is as shown in the following formula:
Where IN_max and OUT_max denote the upper limits on the numbers of inputs and outputs of a custom instruction; IN_v and OUT_v denote the in-degree and out-degree of node v; Pred(u) = {v | v ∈ V, (v, u) ∈ E} and Succ(u) = {v | v ∈ V, (u, v) ∈ E} denote the predecessor set and the successor set of node u; v_in and v_out denote the numbers of inputs and outputs of node v; and m_sel indicates whether node m is selected;
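As a rough illustration of the input and output limits, the sketch below counts the inputs and outputs of a candidate subgraph. The counting rule is an assumption commonly used for custom instructions, not necessarily the patent's exact formula: an input is an outside node that feeds the subgraph, and an output is a selected node with a successor outside it. The names `count_io`, `satisfies_io` and `GRAPH` are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def count_io(selected: Set[int], edges: List[Edge]) -> Tuple[int, int]:
    """Return (number of inputs, number of outputs) of the selected subgraph."""
    # Assumed rule: distinct outside producers are the inputs.
    inputs = {u for u, v in edges if v in selected and u not in selected}
    # Assumed rule: selected nodes consumed outside are the outputs.
    outputs = {u for u, v in edges if u in selected and v not in selected}
    return len(inputs), len(outputs)


def satisfies_io(selected: Set[int], edges: List[Edge],
                 in_max: int, out_max: int) -> bool:
    n_in, n_out = count_io(selected, edges)
    return n_in <= in_max and n_out <= out_max


# Hypothetical data-flow graph.
GRAPH: List[Edge] = [(1, 3), (2, 3), (3, 4), (3, 5)]
```

For the subgraph `{3, 4}` this gives two inputs (from nodes 1 and 2) and one output (node 3, which is also consumed by node 5).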
Selection of custom instructions achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection. Specifically: starting from the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first applied to all subgraphs: given two subgraphs a and b, if a and b are isomorphic, a pattern C_i is created and subgraphs a and b are recorded in pattern C_i as its instances; a pattern is the graph representation of a candidate custom instruction;
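The grouping of isomorphic subgraphs into patterns can be sketched as follows. This is a brute-force illustration that is only practical for very small subgraphs; it assumes that each node carries an opcode label and it ignores operand order. The representation of a subgraph as a `(labels, edges)` pair and both function names are introduced here.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

Subgraph = Tuple[Dict[int, str], Set[Tuple[int, int]]]  # (node -> opcode, edges)


def isomorphic(a: Subgraph, b: Subgraph) -> bool:
    """Try every node mapping that preserves opcodes and edges."""
    labels_a, edges_a = a
    labels_b, edges_b = b
    if len(labels_a) != len(labels_b) or len(edges_a) != len(edges_b):
        return False
    if sorted(labels_a.values()) != sorted(labels_b.values()):
        return False
    nodes_a = list(labels_a)
    for perm in permutations(labels_b):
        mapping = dict(zip(nodes_a, perm))
        same_ops = all(labels_a[n] == labels_b[mapping[n]] for n in nodes_a)
        if same_ops and {(mapping[u], mapping[v]) for u, v in edges_a} == set(edges_b):
            return True
    return False


def group_into_patterns(subgraphs: List[Subgraph]) -> List[List[Subgraph]]:
    """Each returned list holds the instances of one pattern."""
    patterns: List[List[Subgraph]] = []
    for graph in subgraphs:
        for instances in patterns:
            if isomorphic(instances[0], graph):
                instances.append(graph)
                break
        else:
            patterns.append([graph])
    return patterns


# Hypothetical subgraphs: two ADD -> SHL chains and one SHL -> ADD chain.
g1: Subgraph = ({1: "ADD", 2: "SHL"}, {(1, 2)})
g2: Subgraph = ({7: "ADD", 9: "SHL"}, {(7, 9)})
g3: Subgraph = ({3: "SHL", 4: "ADD"}, {(3, 4)})
patterns = group_into_patterns([g1, g2, g3])
```

Here `g1` and `g2` become two instances of the same pattern, while `g3` forms a pattern of its own.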
To establish the constraint programming model of the custom instruction selection problem, some variables are defined first: N is the number of candidate custom instructions enumerated in the custom instruction enumeration stage, and C_i denotes the i-th candidate custom instruction, i = 1, ..., N; custom instruction C_i has n_i instances in the code, denoted c_{i,j}, j = 1, ..., n_i; the execution frequency of instance c_{i,j} is f_{i,j}; the processor performance improvement brought by a custom instruction and the hardware area it requires in the custom functional unit are denoted P_i and A_i, respectively;
The objective function that maximizes the processor performance improvement brought by the custom instructions is then as shown in the following formula:
Where s_{i,j} is a binary variable whose value is 1 when custom instruction instance c_{i,j} is selected and 0 otherwise;
A custom instruction encapsulates multiple basic instructions and thereby reduces the number of instruction fetches and the number of data transfers between the registers and the processor, which reduces the energy consumption of the processor; the objective function that maximizes the reduction in processor energy consumption brought by the custom instructions is then as shown in the following formula:
Where E(c_{i,j}) denotes the number of internal edges of custom instruction instance c_{i,j}; the two terms of the objective represent the reduction in the number of instruction fetches and the reduction in the number of data transfers, respectively; α and β are weight parameters, α + β = 1;
On the basis of the custom instruction selection model established from the above objective functions, and to simplify the problem, a weight-based method is used to convert the multi-objective optimization problem into a single-objective optimization problem, which gives the custom instruction selection model shown in the following formula:
Where γ and ε are weight parameters, γ + ε = 1;
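The weight-based conversion can be written compactly as a weighted sum. In the sketch below, F_perf and F_energy are names introduced here for the two maximization objectives described above; their exact expressions are those of the patent's own formulas and are not reconstructed, and the non-negativity of the weights is an assumption.

```latex
\max_{s}\; \gamma \, F_{\mathrm{perf}}(s) \;+\; \varepsilon \, F_{\mathrm{energy}}(s),
\qquad \gamma + \varepsilon = 1, \quad \gamma, \varepsilon \ge 0
```

With γ = 1 and ε = 0 only the performance objective remains, and with γ = 0 and ε = 1 only the energy objective remains, so the two weights trade one goal against the other.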
For a user-given area constraint: the hardware corresponding to each custom instruction in the custom functional unit has a certain size, so the area constraint of the custom instructions must be modeled, as shown in the following formula:
Where A is the total area budget given at extensible processor design time for the hardware of all custom instructions, A_i is the hardware area of the i-th custom instruction, and S_i is a binary variable: S_i is 1 if at least one instance of custom instruction C_i is selected and 0 otherwise, as shown in the following formula:
The beneficial effects of the above technical solution are as follows. For the custom instruction enumeration problem, the method provided by the present invention separates the modeling of the problem from its solution, so it can be applied to many combinations of constraints and is general and flexible. For the custom instruction selection problem, multi-objective optimization is achieved by establishing a multi-objective constraint programming model. Applying the custom instructions identified automatically by the present invention to image processing algorithms can noticeably improve the performance of those algorithms.
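The selection model described in this section (a weighted objective under an area budget) can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not the patent's solver: the per-instance performance and energy gains are hypothetical precomputed inputs, the function `select_instances` and its data layout are introduced here, and the pattern area A_i is charged once whenever at least one instance is chosen (S_i = 1).

```python
from itertools import product
from typing import Dict, List, Tuple


def select_instances(patterns: List[Dict], area_budget: float,
                     gamma: float = 0.5, eps: float = 0.5
                     ) -> Tuple[float, Tuple[int, ...]]:
    """patterns: [{"area": A_i, "instances": [(perf_gain, energy_gain), ...]}, ...]."""
    # Flatten to one binary decision s_ij per instance.
    flat = [(i, perf, energy)
            for i, pattern in enumerate(patterns)
            for perf, energy in pattern["instances"]]
    best_score, best_choice = 0.0, (0,) * len(flat)
    for choice in product((0, 1), repeat=len(flat)):
        # S_i = 1 as soon as one instance of pattern i is selected.
        used = {i for (i, _, _), s in zip(flat, choice) if s}
        if sum(patterns[i]["area"] for i in used) > area_budget:
            continue  # violates the area budget A
        score = sum(s * (gamma * perf + eps * energy)
                    for (_, perf, energy), s in zip(flat, choice))
        if score > best_score:
            best_score, best_choice = score, choice
    return best_score, best_choice


# Hypothetical candidates: pattern 0 has two instances, pattern 1 has one.
candidates = [
    {"area": 300, "instances": [(4.0, 2.0), (3.0, 2.0)]},
    {"area": 250, "instances": [(5.0, 1.0)]},
]
tight = select_instances(candidates, area_budget=400)
loose = select_instances(candidates, area_budget=600)
```

With a budget of 400 only one pattern fits, and both instances of pattern 0 are chosen; with a budget of 600 every instance is selected. A constraint programming solver would explore the same decision variables with propagation instead of exhaustive enumeration.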
Description of the drawings
Fig. 1 is a flow chart of the automatic identification of an extended instruction set for image processing algorithms, as described in the background of the invention;
Fig. 2 is a schematic diagram of a data-flow graph provided by an embodiment of the present invention;
Fig. 3 compares the running times under different I/O constraints, as provided by an embodiment of the present invention;
Fig. 4 compares the running time of enumerating connected subgraphs with that of enumerating all subgraphs, as provided by an embodiment of the present invention;
Fig. 5 compares the performance improvements, as provided by an embodiment of the present invention;
Fig. 6 compares the numbers of instructions selected by different methods, as provided by an embodiment of the present invention.
Specific embodiment
Specific embodiments of the present invention are described in further detail below with reference to the drawings and embodiments. The following embodiments illustrate the present invention and do not limit its scope.
A method for automatically identifying custom instructions based on constraint programming comprises two parts, enumeration of custom instructions and selection of custom instructions;
Enumeration of custom instructions is carried out by establishing a constraint programming model for custom instruction enumeration and enumerating from the data-flow graph all subgraphs that satisfy the constraints. Specifically:
To enumerate from the data-flow graph G = (V, E) all custom instructions that satisfy the given constraints, let the subgraph S = (Vs, Es) be the graph representation of a custom instruction instance, and let I1 and I2 denote the set of valid nodes and the set of illegal nodes in G, respectively;
The data-flow graph G = (V, E) is a directed acyclic graph (Directed Acyclic Graph, DAG), as shown in Fig. 2. The node set V = {v1, v2, ..., vM} represents the basic instructions, where M is the number of nodes of the data-flow graph; the edge set E represents the data dependences between instructions, where m is the number of edges of the data-flow graph;
The given constraints are: the custom instruction contains no illegal node, the custom instruction is connected, the custom instruction is convex, and the custom instruction respects the input and output limits;
Each constraint is modeled separately and, for the enumeration problem, constraint programming is used to find all custom instructions that satisfy the constraints, which completes the enumeration of custom instructions;
The constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
Where v_sel = 0 indicates that the illegal node v is not included in the custom instruction;
Illegal nodes are defined as follows: owing to the limitations of the extensible processor architecture, two kinds of basic instructions, memory operations and branch operations, cannot be included in a custom instruction, and the nodes representing these basic instructions are regarded as illegal nodes;
The connectivity constraint of a custom instruction is modeled as shown in the following formula:
Where the formula expresses that an undirected path exists between node v and node v_k; this constraint can be removed when disconnected subgraphs are to be enumerated;
The convexity constraint: a custom instruction is convex if and only if every path between any two nodes u and v of the subgraph S passes only through nodes of S. This constraint is modeled as shown in the following formula:
Where u_sel and v_sel indicate whether nodes u and v are selected; 0 means not selected and 1 means selected;
In this embodiment, for the data-flow graph shown in Fig. 2, the subgraph {1, 2, 3} is a convex subgraph, whereas the subgraph {2, 3, 5} is not.
The input and output constraint of a custom instruction is as shown in the following formula:
Where IN_max and OUT_max denote the upper limits on the numbers of inputs and outputs of a custom instruction; IN_v and OUT_v denote the in-degree and out-degree of node v; Pred(u) = {v | v ∈ V, (v, u) ∈ E} and Succ(u) = {v | v ∈ V, (u, v) ∈ E} denote the predecessor set and the successor set of node u; v_in and v_out denote the numbers of inputs and outputs of node v; and m_sel indicates whether node m is selected;
Selection of custom instructions achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection. Specifically: starting from the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first applied to all subgraphs: given two subgraphs a and b, if a and b are isomorphic, a pattern C_i is created and subgraphs a and b are recorded in pattern C_i as its instances; a pattern is the graph representation of a candidate custom instruction;
To establish the constraint programming model of the custom instruction selection problem, some variables are defined first: N is the number of candidate custom instructions enumerated in the custom instruction enumeration stage, and C_i denotes the i-th candidate custom instruction, i = 1, ..., N; custom instruction C_i has n_i instances in the code, denoted c_{i,j}, j = 1, ..., n_i; the execution frequency of instance c_{i,j} is f_{i,j}; the processor performance improvement brought by a custom instruction and the hardware area it requires in the custom functional unit are denoted P_i and A_i, respectively;
The objective function that maximizes the processor performance improvement brought by the custom instructions is then as shown in the following formula:
Where s_{i,j} is a binary variable whose value is 1 when custom instruction instance c_{i,j} is selected and 0 otherwise;
A custom instruction encapsulates multiple basic instructions and thereby reduces the number of instruction fetches and the number of data transfers between the registers and the processor, which reduces the energy consumption of the processor; the objective function that maximizes the reduction in processor energy consumption brought by the custom instructions is then as shown in the following formula:
Where E(c_{i,j}) denotes the number of internal edges of custom instruction instance c_{i,j}; the two terms of the objective represent the reduction in the number of instruction fetches and the reduction in the number of data transfers, respectively; α and β are weight parameters, α + β = 1;
On the basis of the custom instruction selection model established from the above objective functions, and to simplify the problem, a weight-based method is used to convert the multi-objective optimization problem into a single-objective optimization problem, which gives the custom instruction selection model shown in the following formula:
Where γ and ε are weight parameters, γ + ε = 1;
For a user-given area constraint: the hardware corresponding to each custom instruction in the custom functional unit has a certain size, so the area constraint of the custom instructions must be modeled, as shown in the following formula:
Where A is the total area budget given at extensible processor design time for the hardware of all custom instructions, A_i is the hardware area of the i-th custom instruction, and S_i is a binary variable: S_i is 1 if at least one instance of custom instruction C_i is selected and 0 otherwise, as shown in the following formula:
In this embodiment, the experiments were run on an i3-3240 3.4 GHz processor with 4 GB of main memory under Windows 8. The constraint programming tool is JaCoP 2.3. The benchmarks are taken from MediaBench and MiBench. The benchmark applications used in this embodiment are common algorithms in image processing or video processing.
In this embodiment, for the common image processing algorithms, the GeCoS front-end compiler is first used to convert each program into the corresponding control data flow graph. The constraint-programming-based custom instruction enumeration method of the present invention then enumerates from the data-flow graph all subgraphs that satisfy the constraints. The enumeration results are shown in Table 1. The columns Nodes, Enumerated Subgraphs and Time in Table 1 give, respectively, the number of nodes of the data-flow graph of each benchmark program, the number of enumerated connected subgraphs that satisfy the constraints (with the input and output upper limits set to 6 and 2, respectively), and the running time of the enumeration method.
Table 1: Custom instruction enumeration results
To analyze further how different constraints affect the running time of the enumeration method, this embodiment compares the running times under different input and output constraints. For the benchmarks SUSAN, JPEG Encode, JPEG Decode and MESA, the running times under different I/O constraints are compared in Fig. 3.
Fig. 3 shows that the running time of the enumeration method increases markedly as the numbers of inputs and outputs increase. Further analysis shows that increasing the number of outputs affects the running time much more than increasing the number of inputs. For example, relative to an input/output limit of 6/2, the running time of the enumeration method increases by 1.5 times on average when the limit is 7/2, and by 10 times on average when the limit is 6/3.
Connectivity of the enumerated subgraphs is an important constraint in custom instruction enumeration. In this embodiment, the running time of enumerating only connected subgraphs is compared with the running time of enumerating all subgraphs (both connected and disconnected); the result is shown in Fig. 4 (I/O limit 6/2). The figure shows that enumerating all subgraphs takes far longer than enumerating only connected subgraphs.
In this embodiment, the constraint-programming-based custom instruction selection method of the present invention is compared with the custom instruction selection method proposed by Kamal et al. and the one proposed by Xiao et al. The method of Kamal et al. selects, under a given area constraint, the custom instructions that maximize the performance improvement. The method of Xiao et al. reduces power consumption by selecting fewer custom instructions under a given area constraint.
In this embodiment, following the method proposed by Kamal et al., the hardware delay and area of the basic instructions implemented in the custom functional unit that realizes the custom instructions are shown in Table 2.
Table 2: Hardware delay and area of basic instructions in the custom functional unit
Operation    Area    Delay (ns)
SUB          225     0.5
ADD          200     0.5
SHR/SHL      326     0.19
EQT/NEQ      87      0.16
GRT/LKS      115     0.21
AND          41      0.04
OR           42      0.05
XOR          64      0.05
In this embodiment, it is assumed that custom instructions containing multiple nodes execute on the custom functional unit, while the basic instructions of the application that are not included in any custom instruction execute on the reference processor. Formula (13) gives the overall latency of an application that uses custom instructions:
L_h = \left( \sum_{S \in SC} \sum_{i \in C(S)} HW(i) + \sum_{S \in SC} T(S) \right) + \sum_{K \in P} SW(K) \qquad (13)
Where HW(i) denotes the hardware delay of node i of a custom instruction, and T(S) denotes the extra latency needed to transfer the input and output operands of custom instruction S. In formula (13), \sum_{S \in SC} \sum_{i \in C(S)} HW(i) is the accumulated hardware delay of the selected custom instructions (SC is the set of selected custom instructions, and C(S) is the set of nodes on the critical path of the selected custom instruction S); the last term is the accumulated software latency of the basic instructions not included in any custom instruction, where P is the set of those basic instructions.
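Formula (13) can be evaluated with a short script. The sketch below takes the hardware delays from Table 2 and treats the critical path of a custom instruction as its longest internal path; the transfer latencies T(S), the software latencies SW(K), the example instruction and the function names are hypothetical inputs chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hardware delays in ns, taken from Table 2 (paired rows split into two keys).
HW_DELAY_NS: Dict[str, float] = {
    "SUB": 0.5, "ADD": 0.5, "SHR": 0.19, "SHL": 0.19,
    "EQT": 0.16, "NEQ": 0.16, "GRT": 0.21, "LKS": 0.21,
    "AND": 0.04, "OR": 0.05, "XOR": 0.05,
}

CustomInstruction = Tuple[Dict[int, str], List[Tuple[int, int]]]  # (ops, edges)


def critical_path_delay(ops: Dict[int, str],
                        edges: List[Tuple[int, int]]) -> float:
    """Sum of HW(i) over the nodes on the longest internal path, in ns."""
    preds: Dict[int, List[int]] = {n: [] for n in ops}
    for u, v in edges:
        preds[v].append(u)
    finish: Dict[int, float] = {}

    def finish_time(n: int) -> float:
        if n not in finish:
            earliest = max((finish_time(p) for p in preds[n]), default=0.0)
            finish[n] = earliest + HW_DELAY_NS[ops[n]]
        return finish[n]

    return max(finish_time(n) for n in ops)


def overall_latency(selected: List[CustomInstruction],
                    transfer_ns: List[float],
                    uncovered_sw_ns: List[float]) -> float:
    """L_h: hardware critical paths + operand transfers + uncovered software."""
    hardware = sum(critical_path_delay(ops, edges) for ops, edges in selected)
    return hardware + sum(transfer_ns) + sum(uncovered_sw_ns)


# Hypothetical custom instruction: (a + b) XOR (c << k).
example: CustomInstruction = ({1: "ADD", 2: "SHL", 3: "XOR"}, [(1, 3), (2, 3)])
path_ns = critical_path_delay(*example)
total_ns = overall_latency([example], transfer_ns=[1.0], uncovered_sw_ns=[2.0, 2.0])
```

In the example the ADD and SHL nodes run in parallel, so the critical path is ADD followed by XOR and only those two delays are accumulated.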
Shown in calculating such as formula (14) by using the performance boost of custom instruction realization:
Wherein,It is accumulation software time delay (the n expression of all elementary instructions in the source code of original application program The quantity of elementary instruction in source code).
In the present embodiment, by custom instruction selection method and Kamal of the invention et al. and Xiao et al. propose from Instruction method is defined to compare.Wherein, area-constrained condition is set to 10%, 30% and 50%. ginseng of area of reference Examine custom instruction area that area is the selected maximization improving performance of greedy algorithm proposed using Bonzini et al. it With for the 9 benchmark Benchmarks enumerated in table 1, three kinds of selected number of instructions of method (NS) and property The comparison result that (PI) can be promoted is as shown in table 3.
3 custom instruction selection method Comparison of experiment results of table
In the present embodiment, parameter γ, ε, α and β in Model for Multi-Objective Optimization of the invention are set as 0.5.It can observe It arrives, with loosening for area-constrained condition, the performance boost that three kinds of methods obtain is in increased trend.Compared to Xiao et al. The method of proposition, method of the invention performing better than in terms of performance boost: the method for the present invention obtains performance boost average out to 3.12 times, the method that Xiao et al. is proposed obtains 2.81 times of performance boost average out to.On the other hand, the method for the present invention is selected The number of custom instruction example is considerably less than the number of the selected custom instruction example of method of Kamal et al. proposition. The instruction number average out to 58 of the method for the present invention final choice, and the number of instructions for the method final choice that Kamal et al. is proposed is flat It is 62.Number due to reducing custom instruction example can reduce final instruction fetch and data between register and processor The number of transmission, to reduce energy consumption.
In addition, by adjusting parameter γ and ε in Model for Multi-Objective Optimization, the method for the present invention can in terms of performance boost or There is preferably performance in terms of reducing number of instructions.When parameter γ and ε are set to 1 and 0, proposed compared to Kamal et al. Method, the method for the present invention in performance boost effect advantageously, as a result as shown in Figure 5 (area-constrained is 50%).Work as ginseng When number γ and ε is set to 1 and 0, problem model is translated under the conditions of giving area-constrained, asks improving performance maximized Custom instruction selection.Since the constraint programming method that the present invention uses can be used to find optimal solution, and Kamal et al. is proposed Method cannot be guaranteed that the solution obtained is optimal.Therefore, the method for the present invention becomes apparent from performance boost effect.
When γ and ε are set to 0 and 1 respectively, the method of the invention selects fewer instruction instances than the method proposed by Xiao et al.; the results are shown in Figure 6. With γ = 0 and ε = 1, the problem model reduces to selecting, under the given area constraint, the minimum number of instruction instances that cover the original data flow graph. For every benchmark program tested, the constraint programming method selects the minimum number of instructions, whereas the heuristic method proposed by Xiao et al. fails to find the minimum number in most cases.
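The influence of the two weights can be illustrated with a small brute-force sketch. It is not the constraint programming solver of the invention: the toy data (`NODES`, `AREA`, `INSTANCES`) are made up, exhaustive search stands in for the solver, and the score `gamma * gain - eps * count` is an assumed simplification in which the second objective is reduced to the number of selected instances.

```python
from itertools import combinations

NODES = frozenset({1, 2, 3, 4})  # nodes of a toy data flow graph
# Hypothetical hardware area per pattern; basic instructions cost nothing.
AREA = {"A": 30, "B": 30, "C": 50, "basic": 0}
# (pattern, covered nodes, gain = execution frequency * per-execution gain)
INSTANCES = [
    ("A", frozenset({1, 2}), 6),
    ("B", frozenset({3, 4}), 5),
    ("C", frozenset({1, 2, 3, 4}), 4),
] + [("basic", frozenset({n}), 0) for n in sorted(NODES)]


def best_selection(gamma, eps, budget):
    """Exhaustively pick non-overlapping instances that cover every node."""
    best, best_score = None, None
    for k in range(1, len(INSTANCES) + 1):
        for chosen in combinations(INSTANCES, k):
            covered = [n for _, nodes, _ in chosen for n in nodes]
            if len(covered) != len(set(covered)) or set(covered) != NODES:
                continue  # overlapping instances or uncovered nodes
            # A pattern's area is paid once, however many instances use it.
            patterns = {p for p, _, _ in chosen}
            if sum(AREA[p] for p in patterns) > budget:
                continue
            score = gamma * sum(g for _, _, g in chosen) - eps * len(chosen)
            if best_score is None or score > best_score:
                best, best_score = chosen, score
    return sorted(p for p, _, _ in best)


print(best_selection(1.0, 0.0, 60))  # favours total gain
print(best_selection(0.0, 1.0, 60))  # favours few instances
```

With these toy numbers, the gain-only setting picks the two smaller patterns, while the count-only setting picks the single pattern that covers the whole graph.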
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the invention.

Claims (3)

1. An automatic custom instruction identification method based on constraint programming, characterized in that it comprises two parts: enumeration of custom instructions and selection of custom instructions;
the enumeration of custom instructions is carried out by establishing a constraint programming model for custom instruction enumeration and enumerating from the data flow graph all subgraphs that satisfy the constraints, specifically as follows:
in order to enumerate from the data flow graph G(V, E) all custom instructions that satisfy the given constraints, let the subgraph S = (V_s, E_s), with V_s ⊆ V and E_s ⊆ E, be the graph representation of a custom instruction instance, and let I_1 and I_2 denote the set of valid nodes and the set of illegal nodes of graph G, respectively;
the data flow graph G = (V, E) is a directed acyclic graph, in which the node set V = {v_1, v_2, ..., v_M} represents the basic instructions, M is the number of nodes of the data flow graph, the edge set E ⊆ V × V represents the data dependencies between instructions, and m is the number of edges of the data flow graph;
the given constraints comprise: the constraint that a custom instruction contains no illegal node, the connectivity constraint of a custom instruction, the convexity constraint of a custom instruction, and the input/output constraint of a custom instruction;
each constraint is modeled, and for the enumeration problem the constraint programming method is used to find all custom instructions that satisfy the constraints, which completes the enumeration of custom instructions;
the selection of custom instructions achieves multi-objective optimization by establishing a constraint programming model for custom instruction selection, specifically as follows:
on the basis of the subgraphs enumerated in the custom instruction enumeration stage, graph isomorphism matching is first performed on all subgraphs;
in order to establish the constraint programming model of the custom instruction selection problem, some variables are first defined: N is the number of candidate custom instructions enumerated in the custom instruction enumeration stage, and C_i denotes the i-th candidate custom instruction, i = 1, ..., N; custom instruction C_i has n_i instances in the code, denoted c_{i,1}, ..., c_{i,n_i}; the execution frequency of custom instruction instance c_{i,j} is f_{i,j}; the processor performance improvement brought by a custom instruction and the hardware area required to implement it in the custom functional unit are denoted P_i and A_i, respectively;
the maximization objective function for the processor performance improvement brought by the custom instructions is then given by the following formula:
where s_{i,j} is a binary variable whose value is 1 when custom instruction instance c_{i,j} is selected and 0 otherwise;
since a custom instruction encapsulates multiple basic instructions and thereby reduces the number of instruction fetches and the number of data transfers between registers and the processor, it reduces the energy consumption of the processor; the maximization objective function for the reduction in processor energy consumption brought by the custom instructions is then given by the following formula:
where E(c_{i,j}) denotes the number of internal edges of custom instruction instance c_{i,j}, the two terms of the objective represent, respectively, the reduction in the number of fetched instructions and the reduction in the number of data transfers, and α and β are weight parameters with α + β = 1;
on the basis of the custom instruction selection model established above from the objective functions, and in order to simplify the problem, a weight-based method is used to convert the multi-objective optimization problem into a single-objective optimization problem, which gives the custom instruction selection model shown in the following formula:
where γ and ε are weight parameters with γ + ε = 1;
for the area constraint given by the user, the hardware corresponding to each custom instruction in the custom functional unit occupies a certain area, so the area constraint of the custom instructions must be modeled, as shown in the following formula:
where A is the total area budget given for the hardware of all custom instructions when the extensible processor is designed, A_i is the hardware area corresponding to the i-th custom instruction, and S_i is a binary variable; if at least one instance of custom instruction C_i is selected, the value of S_i is 1, and otherwise it is 0, as shown in the following formula:
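The formulas referenced in claim 1 do not survive in this text. The expressions below are one reading that is consistent with the symbol definitions above; they are an assumption, not the claimed formulas. In particular, the names F_perf and F_energy, the instance size |c_{i,j}| (number of basic instructions in an instance), and the use of f_{i,j} inside the energy terms are introduced here for illustration only.

```latex
\begin{align}
F_{\text{perf}} &= \sum_{i=1}^{N} \sum_{j=1}^{n_i} s_{i,j}\, f_{i,j}\, P_i \\
F_{\text{energy}} &= \alpha \sum_{i=1}^{N} \sum_{j=1}^{n_i} s_{i,j}\, f_{i,j}\, \bigl(|c_{i,j}| - 1\bigr)
  + \beta \sum_{i=1}^{N} \sum_{j=1}^{n_i} s_{i,j}\, f_{i,j}\, E(c_{i,j}),
  \qquad \alpha + \beta = 1 \\
\max\ & \gamma\, F_{\text{perf}} + \varepsilon\, F_{\text{energy}},
  \qquad \gamma + \varepsilon = 1 \\
\text{s.t.}\ & \sum_{i=1}^{N} S_i\, A_i \le A \\
& S_i = 1 \iff \sum_{j=1}^{n_i} s_{i,j} \ge 1, \qquad i = 1, \dots, N
\end{align}
```

Under this reading, γ = 1 and ε = 0 leaves only the performance term, and γ = 0 and ε = 1 leaves only the energy term, which matches the two special cases discussed in the embodiment.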
2. The automatic custom instruction identification method based on constraint programming according to claim 1, characterized in that the constraints are modeled specifically as follows:
the constraint that a custom instruction contains no illegal node is modeled as shown in the following formula:
v_sel = 0
where v_sel = 0 indicates that the illegal node v is not included in the custom instruction;
the illegal nodes are defined as follows: owing to the limitations of the extensible processor architecture, two kinds of basic instructions, memory operations and branch operations, cannot be included in a custom instruction, and the nodes representing these basic instructions are regarded as illegal nodes;
the connectivity constraint of a custom instruction is modeled as shown in the following formula:
where the relation in the formula indicates that an undirected path exists between node v and node v_k; when disconnected subgraphs are to be enumerated, this constraint can be removed;
the convexity constraint of a custom instruction holds if and only if every path between any two nodes u and v of subgraph S passes only through nodes of subgraph S; this constraint is modeled as shown in the following formula:
where u_sel and v_sel indicate whether nodes u and v are selected, 0 meaning not selected and 1 meaning selected;
the input/output constraint of a custom instruction is given by the following formula:
where IN_max and OUT_max denote the upper limits on the number of inputs and outputs of a custom instruction, IN_v and OUT_v denote the in-degree and out-degree of node v, Pred(u) = {v | v ∈ V, (v, u) ∈ E} and Succ(u) = {v | v ∈ V, (u, v) ∈ E} denote the predecessor node set and the successor node set of node u, v_in and v_out denote the number of inputs and outputs of node v, and m_sel indicates whether node m is selected.
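The four constraints of claim 2 can be checked directly on a small graph. The sketch below enumerates candidates by brute force instead of constraint programming. The toy graph (`EDGES`, `NODES`, `ILLEGAL`) and all function names are hypothetical, and the input/output counting convention is an assumption: inputs are distinct producers outside the subgraph, and outputs are selected nodes with a consumer outside the subgraph or with no consumer at all.

```python
from itertools import combinations

# Toy data flow graph (hypothetical); edges point from producer to consumer.
EDGES = {(1, 3), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6)}
NODES = {1, 2, 3, 4, 5, 6}
ILLEGAL = {5}  # e.g. a memory access that may not enter a custom instruction


def succ(u):
    return {b for a, b in EDGES if a == u}


def pred(u):
    return {a for a, b in EDGES if b == u}


def reachable(start, step):
    """All nodes reachable from `start` by repeatedly applying `step`."""
    seen, stack = set(), list(start)
    while stack:
        for nxt in step(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_connected(s):
    """True if s is connected when edge directions are ignored."""
    first = next(iter(s))

    def neighbours(u):
        return (succ(u) | pred(u)) & s

    return reachable({first}, neighbours) | {first} == s


def is_convex(s):
    """Non-convex iff an outside node is both below and above the subgraph."""
    below = reachable(s, succ) - s
    above = reachable(s, pred) - s
    return not (below & above)


def io_counts(s):
    inputs = {p for u in s for p in pred(u)} - s
    outputs = {u for u in s if succ(u) - s or not succ(u)}
    return len(inputs), len(outputs)


def enumerate_candidates(in_max, out_max, min_size=2):
    legal = sorted(NODES - ILLEGAL)  # illegal nodes are never selected
    found = []
    for k in range(min_size, len(legal) + 1):
        for combo in combinations(legal, k):
            s = set(combo)
            n_in, n_out = io_counts(s)
            if (is_connected(s) and is_convex(s)
                    and n_in <= in_max and n_out <= out_max):
                found.append(combo)
    return found


print(enumerate_candidates(in_max=2, out_max=1))
```

Exhaustive enumeration grows exponentially with the number of nodes, so it only serves to make the constraints concrete; the claimed method delegates the search to a constraint programming solver.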
3. The automatic custom instruction identification method based on constraint programming according to claim 1, characterized in that the graph isomorphism matching performed on all subgraphs is specifically as follows:
given two subgraphs a and b, if a and b are isomorphic, a pattern C_i is created and subgraphs a and b are recorded in pattern C_i as its instances; the pattern is the graph representation of a candidate custom instruction.
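The grouping step of claim 3 can be sketched as follows. This is an illustration under stated assumptions, not the matching procedure of the invention: each subgraph is represented as a pair (opcode per node, set of directed edges), isomorphism is tested by trying every node bijection (practical only for very small subgraphs), and the function names are hypothetical. Unlike the claim wording, the sketch also opens a pattern for a subgraph that has no isomorphic partner.

```python
from itertools import permutations


def isomorphic(a, b):
    """Brute force: is there an opcode- and edge-preserving node bijection?"""
    ops_a, edges_a = a
    ops_b, edges_b = b
    if len(ops_a) != len(ops_b) or len(edges_a) != len(edges_b):
        return False
    if sorted(ops_a.values()) != sorted(ops_b.values()):
        return False
    nodes_a = list(ops_a)
    for perm in permutations(ops_b):
        mapping = dict(zip(nodes_a, perm))
        same_ops = all(ops_a[n] == ops_b[mapping[n]] for n in nodes_a)
        mapped_edges = {(mapping[u], mapping[v]) for u, v in edges_a}
        if same_ops and mapped_edges == edges_b:
            return True
    return False


def group_into_patterns(subgraphs):
    """Return a list of patterns; each pattern is the list of its instances."""
    patterns = []
    for g in subgraphs:
        for instances in patterns:
            if isomorphic(instances[0], g):
                instances.append(g)  # g is another instance of this pattern
                break
        else:
            patterns.append([g])  # no match: g starts a new pattern
    return patterns


# Two add->mul chains on different nodes, and one mul->add chain.
g1 = ({1: "add", 2: "mul"}, {(1, 2)})
g2 = ({7: "mul", 5: "add"}, {(5, 7)})
g3 = ({3: "add", 4: "mul"}, {(4, 3)})
print([len(p) for p in group_into_patterns([g1, g2, g3])])
```

Comparing each new subgraph only with the first instance of a pattern is sufficient because isomorphism is an equivalence relation.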
CN201910627531.6A 2019-07-12 2019-07-12 Automatic user-defined instruction identification method based on constraint programming Active CN110333857B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910627531.6A CN110333857B (en) 2019-07-12 2019-07-12 Automatic user-defined instruction identification method based on constraint programming

Publications (2)

Publication Number Publication Date
CN110333857A true CN110333857A (en) 2019-10-15
CN110333857B CN110333857B (en) 2023-03-14

Family

ID=68146500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910627531.6A Active CN110333857B (en) 2019-07-12 2019-07-12 Automatic user-defined instruction identification method based on constraint programming

Country Status (1)

Country Link
CN (1) CN110333857B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030014742A1 (en) * 2001-07-09 2003-01-16 Sasken Communication Technologies Limited Technique for compiling computer code to reduce energy consumption while executing the code
CN102929580A (en) * 2012-11-06 2013-02-13 无锡江南计算技术研究所 Partitioning method and device of digit group multi-reference access
CN103995540A (en) * 2014-05-22 2014-08-20 哈尔滨工业大学 Method for rapidly generating finite time track of hypersonic aircraft
CN105138601A (en) * 2015-08-06 2015-12-09 中国科学院软件研究所 Graph pattern matching method for supporting fuzzy constraint relation
CN105335129A (en) * 2014-06-23 2016-02-17 联想(北京)有限公司 Information processing method and electronic equipment
CN107870780A (en) * 2016-09-28 2018-04-03 华为技术有限公司 Data processing equipment and method
US20180196673A1 (en) * 2015-07-31 2018-07-12 Arm Limited Vector length querying instruction
US20180300148A1 (en) * 2017-04-12 2018-10-18 Arm Limited Apparatus and method for determining a recovery point from which to resume instruction execution following handling of an unexpected change in instruction flow

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
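B. Chakraborty et al.: "Handling Constraints in Multi-Objective GA for Embedded System Design", 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design (VLSID'06) *
Xiao Chenglong et al.: "Automatic custom instruction identification method for high-level synthesis" (面向高层次综合的自定义指令自动识别方法), Journal of Computer Applications (《计算机应用》) *
Gong Aihui et al.: "CSPack: a new packing algorithm using CSP graph matching" (CSPack:采用CSP图匹配的新型装箱算法), Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》) *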

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113296788A (en) * 2021-06-10 2021-08-24 上海东软载波微电子有限公司 Instruction scheduling method, apparatus, device, storage medium and program product
CN113296788B (en) * 2021-06-10 2024-04-12 上海东软载波微电子有限公司 Instruction scheduling method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110333857B (en) 2023-03-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant