CN108846248A - A kind of application modeling and performance prediction method - Google Patents
- Publication number
- Publication number: CN108846248A (application number CN201810980603.0A)
- Authority
- CN
- China
- Prior art keywords
- cache
- memory access
- time overhead
- application
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
Abstract
The present invention provides an application modeling and performance prediction method. The application modeling method includes: obtaining computation instructions and memory-access instructions from the instructions produced by compiling the application, modeling the execution of the computation instructions and memory-access instructions according to the architectural features of the machine that runs the application, and obtaining the time overhead of the computation instructions and memory-access instructions; modeling the regular and/or irregular memory accesses of the application in the memory-access stage according to the architectural features, and obtaining the time overhead of the regular and/or irregular memory accesses; and calculating the time overhead of the memory-access stage of the application. The present invention can predict application performance accurately and efficiently, helping application developers find application bottlenecks and adopt corresponding optimization schemes.
Description
Technical field
The present invention relates to the field of application optimization, and more particularly to an application modeling method and a method of predicting application performance.
Background technique
With the progress of semiconductor technology, processors with a multi-level cache (Multi-Level Cache) hierarchy have become the mainstream; the addition of cache components in the processor satisfies an application's demand for memory-access locality. However, because the design of current processors is increasingly complex and the sizes and levels of their caches differ, the same application runs at different speeds on machines with different architectures. How to accurately predict the performance of an application on machines of different architectures, and to optimize the application according to the predicted performance, is currently a research hotspot.
At present, some models can be used to predict the performance of an application. For example, the Roofline model describes the relationship between an application's compute-to-memory-access ratio and bandwidth, and predicts the peak performance of the application. However, the Roofline model does not take the cache hierarchy into account, so its prediction of application performance is not accurate enough. The ECM (Execution-Cache-Memory) model proposed by Holger et al. divides the operation of an application into two stages, in-core and out-of-core, reflecting the computation inside the core and the transfers between memory levels. However, the ECM model does not distinguish between the cache levels (for example, L1 Cache through L3 Cache): it treats the number of misses (Cache Misses) at every cache level as identical, so for applications with data reuse, or applications with a small data scale, the ECM model cannot accurately predict performance.
Summary of the invention
To solve the above problems of the prior art, according to one embodiment of the present invention, an application modeling method is provided, including: obtaining computation instructions and memory-access instructions from the instructions produced by compiling the application, and modeling the execution of the computation instructions and memory-access instructions according to the architectural features of the machine that runs the application, to obtain the time overhead of the computation instructions and memory-access instructions; modeling the regular and/or irregular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the regular and/or irregular memory accesses; and calculating the time overhead of the memory-access stage of the application.
In the above method, modeling the execution of the computation instructions and memory-access instructions according to the architectural features of the machine that runs the application, to obtain their time overhead, includes:
According to the architectural features of the machine that runs the application, simulating the execution of the computation instructions on the corresponding one or more execution ports and simulating the execution of the memory-access instructions on the corresponding one or more execution ports, and calculating the instruction-execution time of each execution port; selecting the longest instruction-execution time among the execution ports that execute computation instructions as the time overhead of the computation instructions; and selecting the longest instruction-execution time among the execution ports that execute memory-access instructions as the time overhead of the memory-access instructions.
In the above method, modeling the regular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the regular memory accesses, includes:
Step 1) obtaining the prefetch strategy of each cache level according to the architectural features, and analyzing the application to obtain the amount of data involved in its regular memory accesses;
Step 2) calculating the miss count of each cache level based on the prefetch strategies of the cache levels and the data amount;
Step 3) calculating the data-transfer time overhead between cache levels and between main memory and the highest-level cache, according to the miss counts of the cache levels, the bandwidth between cache levels, the bandwidth between main memory and the highest-level cache, and the size of a Cache Line;
Step 4) adding the data-transfer time overheads between cache levels, plus the data-transfer time overhead between main memory and the highest-level cache, to obtain the time overhead of the regular memory accesses.
In step 3), the data-transfer time overhead between cache levels and between main memory and the highest-level cache is calculated according to the following formula:
Ti = Ni * Size(CL) / Bi
where Ti denotes the data-transfer time overhead between the level-i cache and the level-(i+1) cache or main memory, Ni denotes the miss count of the level-i cache, Bi denotes the bandwidth between the level-i cache and the level-(i+1) cache or main memory, Size(CL) denotes the size of a Cache Line, and i ≥ 1.
In the above method, modeling the irregular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the irregular memory accesses, includes:
Step a) constructing a cache simulator according to the architectural features, and constructing the memory-access sequence of the application's irregular memory accesses, wherein the input of the cache simulator is a memory-access sequence and its output is the miss count of each cache level;
Step b) inputting the memory-access sequence of the application's irregular memory accesses into the cache simulator to obtain the miss count of each cache level;
Step c) calculating the data-transfer time overhead between cache levels and between main memory and the highest-level cache, according to the miss counts of the cache levels, the bandwidth between cache levels, the bandwidth between main memory and the highest-level cache, and the size of a Cache Line;
Step d) adding the data-transfer time overheads between cache levels, plus the data-transfer time overhead between main memory and the highest-level cache, to obtain the time overhead of the irregular memory accesses.
In the above method, calculating the time overhead of the memory-access stage of the application includes:
If the application has no irregular memory accesses in the memory-access stage, taking the time overhead of the regular memory accesses as the time overhead of the application's memory-access stage;
If the application has irregular memory accesses in the memory-access stage, taking the sum of the time overhead of the regular memory accesses and the time overhead of the irregular memory accesses as the time overhead of the application's memory-access stage.
According to one embodiment of the present invention, an application performance prediction method is also provided, including:
Applying the application to be predicted to the model established by the above method, to obtain the time overhead of the application's computation instructions, the time overhead of its memory-access instructions, and the time overhead of its memory-access stage;
Summing the time overhead of the memory-access instructions and the time overhead of the memory-access stage;
Comparing the sum with the time overhead of the computation instructions, and taking the larger of the two as the expected execution time of the application.
The present invention models the processing stage and the memory-access stage of an application separately, in combination with the architectural features of the machine (computer). The processing stage of the application is further divided into the time overhead of computation instructions and the time overhead of memory-access instructions; the memory-access stage is further divided into the time overhead of regular memory accesses and the time overhead of irregular memory accesses, while the cache levels are distinguished from one another. The present invention can predict application performance more accurately and efficiently, which helps application developers find application bottlenecks and adopt corresponding optimization schemes.
Detailed description of the invention
Embodiments of the present invention are further illustrated below with reference to the accompanying drawings.
Fig. 1 is a flowchart of an application modeling and performance prediction method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an application modeling method according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for modeling the memory-access stage of an application according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the internal connections of the cache simulator, and of the connections among the caches, main memory and registers, according to an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below through specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.
The present invention divides the runtime of an application on a machine into a processing stage and a memory-access stage. The processing stage concerns the execution, in the processor core, of the instructions produced by compiling the application; the memory-access stage concerns the data transfers between cache levels, and between main memory and the highest-level cache, while the application runs. The present invention further divides the processing stage into the time overhead of computation instructions and the time overhead of memory-access instructions. The memory-access stage is divided, on the one hand, into the time overhead of regular memory accesses and the time overhead of irregular memory accesses and, on the other hand, into the data-transfer time overhead between cache levels and the data-transfer time overhead between main memory and the highest-level cache.
According to one embodiment of the present invention, an application modeling and performance prediction method is provided. Referring to Fig. 1, the method is divided into an application modeling phase and a performance prediction phase. The application modeling phase models the application in combination with the architectural features of the machine (that runs the application) and obtains each time overhead of the application; the performance prediction phase calculates the execution time of the application on the machine. The two phases are described in detail below with reference to the drawings.
1. Application modeling phase
The application modeling phase includes processing-stage modeling and memory-access-stage modeling, that is, modeling the execution of the (compiled) instructions in the processor core and modeling the data transfers at application runtime. Processing-stage modeling and memory-access-stage modeling are described in turn below with reference to Fig. 2.
Processing stage modeling
As described above, the processing stage concerns the execution, in the processor core, of the instructions produced by compiling the application. Processing-stage modeling models the execution of the application's compiled computation instructions and memory-access instructions and produces the time overhead of the two kinds of instructions. It includes the following steps:
1. Obtain the computation instructions and memory-access instructions from the instructions produced by compiling the application, detect the architectural features of the machine that will run the application, and simulate the execution of the computation instructions and memory-access instructions based on those architectural features. Specifically, according to the out-of-order scheduling mechanism and the pipeline mechanism of the instructions, simulate the execution of the computation instructions and memory-access instructions on the corresponding execution ports in the processor (it should be noted that the instruction set used by the processor includes but is not limited to X86, MIPS, ARM, etc., and the processor may have one core or multiple cores). There are multiple execution ports in the processor; each execution port is an execution unit in processing-stage modeling and is used to schedule and execute the corresponding instructions, so as to operate on the data in the L1 Cache (the level-1 cache, e.g., the L1 Dcache). Computation instructions execute on the one or more execution ports for executing computation instructions, and memory-access instructions execute on the one or more execution ports for executing memory-access instructions.
2. Calculate the instruction-execution time on each execution port. Among the instruction-execution times of the one or more execution ports for executing computation instructions, select the longest as the time overhead of the computation instructions; among the instruction-execution times of the one or more execution ports for executing memory-access instructions, select the longest as the time overhead of the memory-access instructions.
Memory-access-stage modeling
The memory-access stage concerns the data transfers between cache levels, and between main memory and the highest-level cache, while the application runs on the machine. According to one embodiment of the present invention, the memory-access stage is divided into regular accesses to data and irregular accesses to data, and memory-access-stage modeling involves modeling the regular memory accesses and modeling the irregular memory accesses, thereby obtaining the time overhead of the memory-access stage. Referring to Fig. 3, memory-access-stage modeling includes the following steps:
Step 310. Analyze the source code of the application to obtain the data that the memory-access instructions need to access, and judge from that data whether the application has an irregular memory-access pattern (irregular memory access for short) in the memory-access stage; if there are irregular memory accesses, go to step 320, otherwise go to step 350. If the data addresses to be accessed are contiguous or have a small stride, the accesses are regarded as regular memory accesses (also called contiguous memory accesses); if the data addresses to be accessed are random or have a large stride, for example a stride exceeding the size of a Cache Line (the minimal caching unit in each cache level), the accesses are regarded as irregular memory accesses (also called discontiguous memory accesses or random memory accesses).
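The stride criterion of step 310 can be sketched as follows. The per-pair stride comparison against one Cache Line follows the text; the 64-byte line size is only an example value.

```python
CACHE_LINE = 64  # bytes; an example Cache Line size, as mentioned in the text

def is_regular(addresses, line_size=CACHE_LINE):
    """Step 310's criterion: an access stream whose consecutive address
    strides all stay within one Cache Line is regular; a stream with larger
    or random strides is irregular."""
    strides = (abs(b - a) for a, b in zip(addresses, addresses[1:]))
    return all(s <= line_size for s in strides)

regular = is_regular([0, 8, 16, 24])         # unit-stride array walk
irregular = not is_regular([0, 4096, 128, 7])  # scattered, pointer-like walk
```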
Step 320. Construct a cache simulator according to the architectural features of the machine that will run the application; the input of the cache simulator is a memory-access sequence, and its output is the miss (Cache Miss) count of each cache level.
Specifically, detect the machine to obtain its architectural features, including the number of cache levels, the size of each cache level, the set-associativity strategy of the caches, the cache swap-in/swap-out (replacement) strategy, etc., and construct the cache simulator based on these features. First, build a cache hierarchy identical to that of the physical machine, set the size and the Cache Line blocks of each cache level, and number each Cache Line block. Second, construct the connections between the cache levels and the connection between main memory and the last-level (highest-level) cache, referring to Fig. 4. After the memory-access sequence is input into the cache simulator, the simulator looks up the corresponding data based on the sequence (if a lookup misses, the data is transferred level by level from main memory or a lower cache, such as the L2 Cache or L3 Cache, to the L1 Cache for the memory-access instruction to use via the registers), simulates the swap-in and swap-out of data at each cache level according to that level's replacement strategy, and obtains the miss count of each cache level from its swap-in/swap-out counts as the output.
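The miss-counting role of the simulator can be sketched as below. For brevity the sketch uses direct-mapped levels; the simulator described in the text additionally reproduces the machine's set associativity and replacement strategy, which are omitted here.

```python
class SimpleCacheLevel:
    """One direct-mapped cache level that records which Cache Line number is
    resident in each slot. Associativity and replacement policy, which the
    patent's simulator models, are deliberately left out of this sketch."""
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = {}      # slot index -> Cache Line number currently resident
        self.misses = 0

    def access(self, line_no):
        idx = line_no % self.num_lines
        if self.tags.get(idx) != line_no:
            self.misses += 1
            self.tags[idx] = line_no   # swap the missing line in
            return False               # miss: the next level is consulted
        return True                    # hit

def simulate(levels, line_sequence):
    """Feed a memory-access sequence (Cache Line numbers) through the
    hierarchy: a hit at one level stops the lookup, a miss falls through.
    Returns the per-level miss counts, the simulator's output."""
    for line_no in line_sequence:
        for level in levels:
            if level.access(line_no):
                break
    return [lvl.misses for lvl in levels]
```

For example, a tiny two-level hierarchy fed the sequence [0, 0, 1, 0] misses twice at each level (cold misses for lines 0 and 1) and hits in the first level otherwise.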
Step 330. Construct the memory-access sequence of the application's irregular memory accesses.
Analyze the application to obtain the addresses of the data accessed by the irregular memory accesses, and construct the memory-access sequence of the irregular memory accesses based on the size of a Cache Line. The memory-access sequence is the address sequence of the data accessed by the irregular memory accesses; each of its elements corresponds to the number of one Cache Line block.
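The mapping from byte addresses to Cache Line block numbers in step 330 can be sketched as follows; the 64-byte line size is again only an example.

```python
CACHE_LINE = 64  # bytes; an example Cache Line size

def to_line_sequence(byte_addresses, line_size=CACHE_LINE):
    """Map each accessed byte address to its Cache Line block number,
    producing the memory-access sequence that the cache simulator consumes."""
    return [addr // line_size for addr in byte_addresses]

seq = to_line_sequence([0, 63, 64, 200])  # addresses 0 and 63 share line 0
```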
Step 340. Model the irregular memory accesses: input the memory-access sequence of the irregular memory accesses into the cache simulator, output the miss count of each cache level corresponding to the irregular memory accesses, and then go to step 350.
Step 350. Model the regular memory accesses to obtain the miss count of each cache level corresponding to the regular memory accesses, then go to step 360.
Specifically, obtain the prefetch strategy of each cache level according to the architectural features of the machine that will run the application, analyze the application to obtain the amount of data accessed by its regular memory accesses, and calculate the miss count of each cache level based on the prefetch strategies of the cache levels and the data amount involved in the regular memory accesses.
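How a prefetch strategy maps a regular access volume to a miss count is machine-specific and the text does not fix a formula; a minimal sketch, under the assumption that a streaming access either misses once per Cache Line touched or is fully covered by the prefetcher, is:

```python
import math

CACHE_LINE = 64  # bytes; an example Cache Line size

def regular_miss_count(data_bytes, line_size=CACHE_LINE, prefetch_covers=False):
    """For a regular (streaming) access over data_bytes bytes, a level whose
    prefetcher does not cover the stream misses once per Cache Line touched;
    a level whose prefetcher fully covers it ideally misses not at all.
    Whether prefetch covers the stream is an assumption supplied per level."""
    if prefetch_covers:
        return 0
    return math.ceil(data_bytes / line_size)

n_no_prefetch = regular_miss_count(1000)                        # 16 lines touched
n_prefetched  = regular_miss_count(1000, prefetch_covers=True)  # 0
```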
Step 360. Calculate the time overhead of the memory-access stage, which includes the time overhead of the regular memory accesses and, if there are irregular memory accesses, also the time overhead of the irregular memory accesses. Calculating the time overhead of the regular memory accesses includes:
1. Calculate the data-transfer time overhead between cache levels and between main memory and the highest-level cache, according to the miss counts of the cache levels corresponding to the regular memory accesses, the bandwidth between cache levels, the bandwidth between main memory and the highest-level cache, and the Cache Line size of the caches. The calculation formula is as follows:
Ti = Ni * Size(CL) / Bi    (1)
where Ti denotes the data-transfer time overhead between the level-i cache and the level-(i+1) cache (or main memory, if there is no level-(i+1) cache), Ni denotes the miss count of the level-i cache, Bi denotes the bandwidth between the level-i cache and the level-(i+1) cache (or main memory, if there is no level-(i+1) cache), Size(CL) denotes the size of the Cache Line of the caches, which may for example be 64 bytes, and i ≥ 1.
2. Add the data-transfer time overheads between cache levels, plus the data-transfer time overhead between main memory and the highest-level cache, to obtain the time overhead of the regular memory accesses.
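Formula (1) and the summation in step 2 can be sketched as follows. The miss counts and bandwidths are hypothetical; the bandwidth unit (bytes per cycle here) only needs to be consistent with the desired time unit.

```python
def transfer_overhead(miss_counts, bandwidths, line_size=64):
    """Apply Ti = Ni * Size(CL) / Bi for each level i and sum the results,
    as in formula (1). miss_counts[i] is Ni for the level-(i+1) cache;
    bandwidths[i] is Bi, the bandwidth to the next level (or to main memory
    for the last entry)."""
    assert len(miss_counts) == len(bandwidths)
    return sum(n * line_size / b for n, b in zip(miss_counts, bandwidths))

# Hypothetical three-level hierarchy: L1/L2/L3 miss counts and the bandwidths
# of the L1-L2, L2-L3 and L3-memory links, in bytes per cycle.
t_regular = transfer_overhead([1000, 400, 50], [32.0, 16.0, 8.0])
```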
Calculating the time overhead of the irregular memory accesses is similar to calculating the time overhead of the regular memory accesses.
If the memory-access stage of the application includes only regular memory accesses, the time overhead of the regular memory accesses is taken as the time overhead of the memory-access stage; if the memory-access stage includes both regular and irregular memory accesses, the time overhead of the memory-access stage is the sum of the time overhead of the regular memory accesses and the time overhead of the irregular memory accesses.
2. Performance prediction phase
In this phase, the execution time of the application on the machine is calculated from the time overheads obtained in the application modeling phase:
1. Sum the time overhead of the application's memory-access instructions and the time overhead of its memory-access stage;
2. Compare the sum from step 1 with the time overhead of the computation instructions, and take the larger of the two as the expected execution time of the application.
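The two prediction steps can be sketched as follows; taking the maximum reflects the assumption, implicit in the comparison above, that the computation stream and the memory stream overlap so that the slower of the two bounds the execution time.

```python
def predicted_time(t_compute, t_access_insn, t_mem_stage):
    """Step 1: sum the memory-access-instruction overhead and the
    memory-access-stage overhead. Step 2: the larger of that sum and the
    computation-instruction overhead is the expected execution time."""
    return max(t_compute, t_access_insn + t_mem_stage)

compute_bound = predicted_time(500, 100, 300)  # computation dominates
memory_bound  = predicted_time(200, 100, 300)  # memory traffic dominates
```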
The present invention models the processing stage and the memory-access stage of an application separately. Compared with the Roofline model, the present invention does not merely predict an upper performance bound for the application from peak values, but gives an accurate performance expectation for the machine architecture. Compared with the ECM model, the present invention separates the cache levels that ECM leaves undifferentiated, realizing a per-level prediction of the miss counts of the multi-level cache. In addition, the present invention divides the memory-access stage of the application into regular and irregular memory accesses, and constructs a cache simulator to simulate the data transfers of the irregular memory accesses, so it can predict the data-transfer time overhead more accurately and efficiently.
It should be noted that, where efficiency is not a concern, the regular memory accesses may also be modeled using the cache simulator.
It should be noted that some of the exemplary methods are depicted as flowcharts. Although a flowchart expresses the operations as a sequence, many of the operations can be executed in parallel or concurrently, and the order of the operations can be rearranged. Processing may terminate when the operations are completed, but there may also be additional steps not included in the figures or the embodiments.
It should be understood that the exemplary embodiments realized in software are usually encoded on some form of program storage medium or realized over some type of transmission medium. The program storage medium can be an arbitrary non-transitory storage medium, such as a magnetic disk (for example, a floppy disk or hard disk) or an optical disk (for example, a compact disk read-only memory, or "CD-ROM"), and can be read-only or random-access. Similarly, the transmission medium can be twisted pair, coaxial cable, optical fiber, or some other applicable transmission medium known in the art.
Although the present invention has been described by means of preferred embodiments, the present invention is not limited to the embodiments described here; it also includes various changes and variations made without departing from the present invention.
Claims (8)
1. An application modeling method, including:
obtaining computation instructions and memory-access instructions from the instructions produced by compiling the application, and modeling the execution of the computation instructions and memory-access instructions according to the architectural features of the machine that runs the application, to obtain the time overhead of the computation instructions and memory-access instructions; and
modeling the regular and/or irregular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the regular and/or irregular memory accesses; and calculating the time overhead of the memory-access stage of the application.
2. The method according to claim 1, wherein modeling the execution of the computation instructions and memory-access instructions according to the architectural features of the machine that runs the application, to obtain their time overhead, includes:
according to the architectural features of the machine that runs the application, simulating the execution of the computation instructions on the corresponding one or more execution ports and simulating the execution of the memory-access instructions on the corresponding one or more execution ports, and calculating the instruction-execution time of each execution port; and
selecting the longest instruction-execution time among the one or more execution ports that execute computation instructions as the time overhead of the computation instructions, and selecting the longest instruction-execution time among the one or more execution ports that execute memory-access instructions as the time overhead of the memory-access instructions.
3. The method according to claim 1, wherein modeling the regular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the regular memory accesses, includes:
step 1) obtaining the prefetch strategy of each cache level according to the architectural features, and analyzing the application to obtain the amount of data involved in its regular memory accesses;
step 2) calculating the miss count of each cache level based on the prefetch strategies of the cache levels and the data amount;
step 3) calculating the data-transfer time overhead between cache levels and between main memory and the highest-level cache, according to the miss counts of the cache levels, the bandwidth between cache levels, the bandwidth between main memory and the highest-level cache, and the size of a Cache Line;
step 4) adding the data-transfer time overheads between cache levels, plus the data-transfer time overhead between main memory and the highest-level cache, to obtain the time overhead of the regular memory accesses.
4. The method according to claim 3, wherein in step 3) the data-transfer time overhead between cache levels and between main memory and the highest-level cache is calculated according to the following formula:
Ti = Ni * Size(CL) / Bi
where Ti denotes the data-transfer time overhead between the level-i cache and the level-(i+1) cache or main memory, Ni denotes the miss count of the level-i cache, Bi denotes the bandwidth between the level-i cache and the level-(i+1) cache or main memory, Size(CL) denotes the size of a Cache Line, and i ≥ 1.
5. The method according to claim 1, wherein modeling the irregular memory accesses of the application in the memory-access stage according to the architectural features, to obtain the time overhead of the irregular memory accesses, includes:
step a) constructing a cache simulator according to the architectural features, and constructing the memory-access sequence of the application's irregular memory accesses, wherein the input of the cache simulator is a memory-access sequence and its output is the miss count of each cache level;
step b) inputting the memory-access sequence of the application's irregular memory accesses into the cache simulator to obtain the miss count of each cache level;
step c) calculating the data-transfer time overhead between cache levels and between main memory and the highest-level cache, according to the miss counts of the cache levels, the bandwidth between cache levels, the bandwidth between main memory and the highest-level cache, and the size of a Cache Line;
step d) adding the data-transfer time overheads between cache levels, plus the data-transfer time overhead between main memory and the highest-level cache, to obtain the time overhead of the irregular memory accesses.
6. The method according to claim 1, wherein calculating the time overhead of the memory-access stage of the application includes:
if the application has no irregular memory accesses in the memory-access stage, taking the time overhead of the regular memory accesses as the time overhead of the application's memory-access stage; and
if the application has irregular memory accesses in the memory-access stage, taking the sum of the time overhead of the regular memory accesses and the time overhead of the irregular memory accesses as the time overhead of the application's memory-access stage.
7. An application performance prediction method, including:
applying the application to be predicted to the model established by the method of any one of claims 1-6, to obtain the time overhead of the application's computation instructions, the time overhead of its memory-access instructions, and the time overhead of its memory-access stage;
summing the time overhead of the memory-access instructions and the time overhead of the memory-access stage; and
comparing the sum with the time overhead of the computation instructions, and taking the larger of the two as the expected execution time of the application.
8. A computing device, including a processor and a memory, the memory storing instructions that, when executed by the processor, cause the computing device to execute the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810980603.0A CN108846248B (en) | 2018-08-27 | 2018-08-27 | Application modeling and performance prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108846248A true CN108846248A (en) | 2018-11-20 |
CN108846248B CN108846248B (en) | 2020-07-31 |
Family
ID=64188608
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810980603.0A Active CN108846248B (en) | 2018-08-27 | 2018-08-27 | Application modeling and performance prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846248B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060179429A1 (en) * | 2004-01-22 | 2006-08-10 | University Of Washington | Building a wavecache |
CN103605833A (en) * | 2013-10-30 | 2014-02-26 | 华为数字技术(苏州)有限公司 | Method and device for simulating performance of storage array system |
- 2018-08-27: CN application CN201810980603.0A granted as CN108846248B (Active)
Non-Patent Citations (3)
Title |
---|
RAIMUND KIRNER; PETER PUSCHNER: "Time-Predictable Task Preemption for Real-Time Systems with Direct-Mapped Instruction Cache", 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC'07) * |
FANG ZHENMAN: "Research on Optimization and Evaluation of Multi-core Cache Systems", China Doctoral Dissertations Full-text Database (Electronic Journal), Information Science & Technology * |
MA KE: "Construction and Study of a Microprocessor Performance Analysis Model", China Doctoral Dissertations Full-text Database, Information Science & Technology * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114265593A (en) * | 2021-12-09 | 2022-04-01 | 北京奕斯伟计算技术有限公司 | Instruction scheduling method, device, equipment and computer readable storage medium |
CN114265593B (en) * | 2021-12-09 | 2022-11-22 | 北京奕斯伟计算技术股份有限公司 | Instruction scheduling method, device, equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108846248B (en) | 2020-07-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11663125B2 (en) | Cache configuration performance estimation | |
Ascia et al. | Efficient design space exploration for application specific systems-on-a-chip | |
JP2022511491A (en) | Generation of integrated circuit floor plans using neural networks | |
CN112352219B (en) | System and method for automated compilation | |
CN108122032A (en) | A kind of neural network model training method, device, chip and system | |
CN116702850A (en) | Method, system, article of manufacture, and apparatus for mapping workloads | |
JP4791959B2 (en) | Block modeling I / O buffer | |
CN114172820B (en) | Cross-domain SFC dynamic deployment method, device, computer equipment and storage medium | |
CN112771554A (en) | Predictive variables in programming | |
CN112764893B (en) | Data processing method and data processing system | |
JP3608915B2 (en) | Multiprocessing system performance evaluation method and apparatus, and storage medium storing multiprocessing system performance evaluation program | |
CN109067583A (en) | A kind of resource prediction method and system based on edge calculations | |
Sohrabizadeh et al. | Automated accelerator optimization aided by graph neural networks | |
CN108804391A (en) | A kind of building method and system of interpolation curve or curved surface based on B-spline | |
EP3805995A1 (en) | Method of and apparatus for processing data of a deep neural network | |
CN108846248A (en) | A kind of application modeling and performance prediction method | |
Franssen et al. | Control flow optimization for fast system simulation and storage minimization [real-time multidimensional signal processing] | |
Zhou et al. | Makespan–cost–reliability-optimized workflow scheduling using evolutionary techniques in clouds | |
CN110109702B (en) | Android computing migration online decision-making method based on code analysis | |
CN105787265A (en) | Atomic spinning top random error modeling method based on comprehensive integration weighting method | |
Cicirelli et al. | Analyzing stochastic reward nets by model checking and parallel simulation | |
Jünger et al. | Amaix: a generic analytical model for deep learning accelerators | |
Hu et al. | Optimizing resource allocation for data-parallel jobs via gcn-based prediction | |
CN114995818A (en) | Method for automatically configuring optimized parameters from Simulink model to C language | |
Pan et al. | Asynchronous value iteration network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||