CN115129393A

CN115129393A - Application configuration determining method and device, electronic equipment and storage medium

Info

Publication number: CN115129393A
Application number: CN202210791450.1A
Authority: CN
Inventors: 孔庆凯; 王清坤; 贾耀仓; 陈维伟
Original assignee: Beijing Zhongke Haixin Technology Co ltd
Current assignee: Beijing Zhongke Haixin Technology Co ltd
Priority date: 2022-07-06
Filing date: 2022-07-06
Publication date: 2022-09-30
Anticipated expiration: 2042-07-06
Also published as: CN115129393B

Abstract

The invention discloses an application configuration determining method, an application configuration determining device, electronic equipment and a storage medium, and relates to the technical field of reconfigurable processors, wherein the method comprises the steps of sampling configuration information of a reconfigurable processor in a continuous distribution sampling mode to obtain continuous distribution samples and distribution values thereof; determining a performance prediction result and a control variable of the continuous distribution sample based on a proxy model, and determining a loss value based on the performance prediction result and a target performance of the continuous distribution sample; and determining the configuration information to be application configuration by adopting a gradient descent method based on the loss value, the control variable and the distribution value of the continuous distribution sample.

Description

Application configuration determining method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of reconfigurable processor technologies, and in particular, to a method and an apparatus for determining an application configuration, an electronic device, and a storage medium.

Background

The reconfigurable chip is a novel chip architecture technology. It is programmable in both software and hardware, its hardware function changes as the software changes, and its flexibility is comparable to that of a CPU (central processing unit) or an FPGA (field programmable gate array). Using reconfigurable technology, the chip can customize its architecture, including its logic channels and storage system, according to the software at run time, thereby balancing high energy efficiency with high flexibility.

At present, there are two main methods for selecting the hyper-parameters of the optimal configuration information according to the application characteristics: hyper-parameter design based on heuristic search and hyper-parameter design based on a proxy model. Both have problems. Hyper-parameter design based on heuristic search has low sampling efficiency, large evaluation variance and large search overhead; hyper-parameter design based on a proxy model requires manual training and adjustment, so the manual cost is large.

Disclosure of Invention

The invention aims to provide an application configuration determining method, an application configuration determining device, electronic equipment and a storage medium, so that the sampling efficiency is improved, and the evaluation variance and the searching process overhead are reduced.

In a first aspect, a method for determining application configuration provided by the present invention includes:

sampling configuration information of a reconfigurable processor by adopting a continuous distribution sampling mode to obtain continuous distribution samples of the configuration information and distribution values of the continuous distribution samples;

determining a performance prediction result of the continuously distributed samples based on a proxy model;

determining a control variable based on the performance prediction result;

determining a loss value based on actual performance and target performance of the configuration information sample;

determining a gradient of the configuration information based on the loss value, the control variable, and a distribution value of the continuous distribution samples;

and if the gradient of the configuration information meets an iteration triggering condition, updating the configuration information by a back propagation algorithm, and if the gradient of the configuration information meets an iteration termination condition, determining the configuration information as the application configuration based on the gradient of the configuration information.

Compared with the prior art, in the application configuration determining method provided by the invention, the performance prediction result of the continuous distribution samples is determined based on the proxy model, and an estimate of that result is used as the control variable. The control variable is therefore closely related to the loss function, and introducing it into the gradient calculation of the configuration information reduces the sampling variance, improves the search efficiency and reduces the overhead of the search process. Because the configuration information of the reconfigurable processor is sampled in a continuous distribution sampling mode, the non-differentiable discrete variables contained in the configuration information can be approximated by a continuous distribution, so the performance prediction result determined by the proxy model can cover both differentiable and non-differentiable samples. When the configuration information is then updated by gradient descent and its gradient meets the iteration triggering condition, the back propagation algorithm updates non-differentiable variables as well as differentiable ones, which ensures the accuracy of the search. The control variable method thus reduces the variance and improves both the sampling efficiency and the search quality.

In a second aspect, the present invention further provides an application configuration determining apparatus, including:

the sampling module is used for sampling the configuration information of the reconfigurable processor by adopting a continuous distribution sampling mode to obtain continuous distribution samples of the configuration information and distribution values of the configuration information;

a determining module, configured to determine a performance prediction result of the continuously distributed sample based on a proxy model, determine a control variable based on the performance prediction result, determine a loss value based on an actual performance and a target performance of the configuration information sample, and determine a gradient of the configuration information based on the loss value, the control variable, and a distribution value of the continuously distributed sample;

and the updating module is used for updating the configuration information by adopting a back propagation algorithm if the gradient of the configuration information meets an iteration triggering condition, and determining the configuration information to be application configuration based on the gradient of the configuration information if the gradient of the configuration information meets an iteration termination condition.

Compared with the prior art, the beneficial effects of the application configuration determining device provided by the invention are the same as the beneficial effects of the application configuration determining method in the technical scheme, and the detailed description is omitted here.

The invention also provides an electronic device comprising a memory and a processor, wherein the memory is used for storing computer instructions, and the computer instructions are executed by the processor to realize the method according to the technical scheme of the invention.

Compared with the prior art, the beneficial effects of the electronic device provided by the invention are the same as the beneficial effects of the application configuration determining method in the technical scheme, and the details are not repeated here.

The present invention also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method according to the present invention.

Compared with the prior art, the beneficial effects of the non-transitory computer readable storage medium provided by the invention are the same as the beneficial effects of the application configuration determining method in the technical scheme, and the details are not repeated here.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart illustrating a method for configuring hyper-parameters of a reconfigurable processor based on heuristic search in the prior art;

FIG. 2 is a flowchart illustrating a search process for hyper-parameter configuration of a reconfigurable processor based on a proxy model in the prior art;

FIG. 3 is a flowchart of a method for determining application configuration according to an embodiment of the present invention;

FIG. 4 is a block diagram of an exemplary application configuration determination apparatus;

FIG. 5 is a schematic block diagram of a chip of an embodiment of the invention;

FIG. 6 is a block diagram of an electronic device of a server or a client according to an embodiment of the present invention.

Reference numerals are as follows:

101-application configuration parameter space, 102-reconfigurable processor simulation configuration environment, 1021-first memory, 1022-second memory, 103-heuristic search controller, 104-configuration information;

201-application configuration parameter space, 202-reconfigurable processor performance model, 203-training pair, 204-agent model;

401-sampling module, 402-determining module, 403-updating module;

501-processor, 502-communication interface, 503-memory, 504-bus system;

601-calculation unit, 602-ROM, 603-RAM, 604-bus, 605-I/O interface, 606-input unit, 607-output unit, 608-storage unit, 609-communication unit.

Detailed Description

To describe the technical solutions of the embodiments of the present invention clearly, words such as "first" and "second" are used in the embodiments to distinguish identical or similar items that have substantially the same functions and actions. For example, the first threshold and the second threshold are only used to distinguish different thresholds, and their order is not limited. Those skilled in the art will appreciate that the terms "first," "second," and the like do not denote any order or importance; they serve only to distinguish one item from another.

It is to be understood that the terms "exemplary" or "such as" are used herein to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.

In the present invention, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and indicates that three relationships are possible; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a; b; c; a and b; a and c; b and c; or a, b and c, where a, b and c can each be single or multiple.

The reconfigurable processor architecture may comprise a reconfigurable data path and a reconfigurable controller. The role of the reconfigurable data path is similar to that of the arithmetic and logic unit (ALU) of a general-purpose processor, but the data path is usually an array of many basic processing elements (PEs). Each PE functions like an ALU, and intermediate calculation results are forwarded among the PEs through a flexible interconnect, so the data path is customized by customizing the configuration information, giving the reconfigurable processor both high energy efficiency and high flexibility.

In practical application, the reconfigurable processor has a large amount of reconfigurable logic, and the setting of the configuration information reflects the spatial-domain mapping characteristic of reconfigurable computing: it expresses not only the operations of the processing units but also, by explicitly setting the interconnect function of the chip, the flow of data among the processing units, so that the reconfigurable processor meets the application requirements. The configuration information contains a large number of hyper-parameters, such as chip operating frequency and interconnection mode, which determine the operating efficiency of the reconfigurable processor. This raises the challenge of how to select the hyper-parameters of the optimal configuration information according to the application characteristics.

For example, based on exploration and analysis of data and on the hyper-parameter tuning experience of human experts, different configuration information parameters can be selected for a specific application, and the hyper-parameter configuration with the best performance chosen through repeated tests and comparative analysis. This approach depends heavily on the professional knowledge and tuning experience of reconfigurable chip design experts; the adjustment is difficult, the repeated tests are time-consuming and labor-intensive, and it is hard to meet efficiently and promptly the large number of agile reconfigurable-architecture application requirements of the Internet of Everything era. To reduce the time, labor and manual design overhead of setting the hyper-parameters of the reconfigurable chip configuration information, two methods are mainly used: hyper-parameter design based on heuristic search and hyper-parameter design based on a proxy model.

Fig. 1 shows a flowchart of a prior art heuristic search based hyper-parameter configuration of a reconfigurable processor. As shown in fig. 1, the procedure for configuring hyper-parameters of a reconfigurable processor based on heuristic search is as follows: firstly, an application configuration parameter space 101 and a reconfigurable processor design simulation environment 102 are constructed according to application, a heuristic search controller 103 is set, the heuristic search controller 103 obtains configuration information 104 meeting constraint conditions by using initial parameters and constraint parameters of a genetic algorithm, then the configuration information 104 and clock signals are input into a first memory 1021, and the first memory 1021 controls and configures a second memory 1022 according to the input configuration information 104 under the control of the clock signals, so that circuits in the second memory 1022 are reconfigured; meanwhile, the first memory 1021 also inputs the test signal to the second memory 1022, and the second memory 1022 processes the test signal by using the reconstructed circuit and outputs the test signal. In this process, the first memory 1021 may detect the status of the reconstructed circuit, so as to obtain the performance data of the second memory 1022, and then the first memory 1021 returns the performance data of the second memory 1022 to the heuristic search controller 103. The heuristic search controller 103 may update the configuration information 104 based on the performance data of the second memory 1022 until the performance data of the second memory 1022 meets the application requirements.

However, in the heuristic search-based hyper-parameter design method, the sampling efficiency is low, the evaluation variance is too large, the search overhead of the hyper-parameter design is large, and a better configuration scheme can be obtained only after days or even months, which is not acceptable for practical industrial application.

Fig. 2 shows a flowchart of a prior-art search for the hyper-parameter configuration of a reconfigurable processor based on a proxy model. As shown in fig. 2, the search flow is as follows. First, an application configuration parameter space 201 is constructed according to the application characteristics, and a large number of sample pairs (Z, A) are sampled from it, where Z represents a configuration sample and A represents an application. The sampled pairs are sent to a reconfigurable processor performance model 202 to obtain the real performance data C corresponding to each pair (Z, A), and training pairs (Z, A, C) 203 consisting of configuration samples, applications and performance data are constructed. A proxy model 204 is then constructed manually to predict quickly the performance of a design pair (Z, A), giving performance prediction data P. A loss function is used to determine the loss value between the performance prediction data P and the corresponding real performance data C; while the loss value does not meet the training termination condition, the parameter values of the proxy model 204 are updated by a back propagation algorithm until it does. When training of the proxy model 204 is complete, its parameter values may be fixed, the performance corresponding to the configuration information is predicted by the proxy model 204, and the configuration information is optimized by gradient descent so as to find gradually the optimal-performance configuration Zopt.

However, it is not easy to model the performance of a reconfigurable processor accurately with a proxy model. The proxy model function must be constructed, trained and adjusted manually, and because many deployment, application and configuration factors affect the hardware performance indexes, it is difficult for the proxy model to predict hardware performance accurately, which can cause the collaborative search to fail completely.

In studying hyper-parameter setting for reconfigurable processors, the inventors found that the heuristic search method has low sampling efficiency. When a proxy model is used to optimize the hyper-parameters of a reconfigurable processor, the problem can be stated formally as: given the application constraint, i.e. the performance C, determine the configuration parameters contained in the configuration information Z so as to minimize the design loss function.

For a given application constraint A, assume that Z is drawn from a parameter distribution determined by a parameter φ (the generation probability of the configuration information), and let $p_\phi(Z)$ denote the distribution of the configuration information Z determined by φ. Minimizing the loss function can then be converted into learning the parameter φ that minimizes the expectation of the loss function.

In the actual optimization procedure, the loss function is usually a combination of non-differentiable and differentiable targets; the non-differentiable targets include chip area, delay and power consumption. Evaluating the loss function for a specific configuration information Z is usually time-consuming. For example, determining the power consumption of a chip in a given configuration requires cycle-accurate simulation and usually takes several hours or even days, which is one of the main reasons for the long turnaround of the heuristic search method. The proxy function method replaces the loss function with a constructed differentiable function f(z) whose derivative with respect to the parameter φ (the generation probability of the configuration information) can be computed, thereby speeding up the search. However, constructing an accurate differentiable function f(z) to replace the loss function is complex, and in some cases, for example when an encoding option is changed, the prediction of the substitute function f(z) becomes invalid and the search fails. From the viewpoint of improving sampling efficiency, and following the idea of the Relax estimator, the embodiments of the invention provide a method and a device for determining the application configuration.
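The optimization problem described above can be written compactly. This is a sketch only: the symbol $\mathcal{L}$ for the design loss is an assumed notation (the text does not preserve the original symbol), and the second expression is the standard score-function identity rather than a formula taken from this document.

```latex
\phi^{*} = \arg\min_{\phi}\; \mathbb{E}_{Z \sim p_{\phi}(Z)}\big[\mathcal{L}(Z)\big],
\qquad
\nabla_{\phi}\, \mathbb{E}_{Z \sim p_{\phi}(Z)}\big[\mathcal{L}(Z)\big]
  = \mathbb{E}_{Z \sim p_{\phi}(Z)}\big[\mathcal{L}(Z)\, \nabla_{\phi} \log p_{\phi}(Z)\big]
```

The identity shows why only values of the loss, not its derivatives, are needed when the gradient is taken with respect to φ.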

Fig. 3 is a step diagram illustrating an application configuration determining method according to an embodiment of the present invention. As shown in fig. 3, the method for determining application configuration according to the embodiment of the present invention includes:

Step 301: Sampling the configuration information of the reconfigurable processor in a continuous distribution sampling mode to obtain continuous distribution samples of the configuration information and distribution values of the continuous distribution samples.

For example, the configuration information of the reconfigurable processor may be sampled from a configuration parameter space, which may be represented by a discrete vector $z = [z^1, z^2, \ldots, z^k]$, where $z^i$ represents the selection of the $i$-th parameter and $m$ is the number of selectable options for the $i$-th parameter. To facilitate neural network processing, $z^i$ is expressed as a one-hot code, i.e. $z^i \in \{0, 1\}^m$ with exactly one entry equal to 1.

For example, when $z^i$ represents the selection of the processor frequency in a reconfigurable processor and there are four choices, 1 GHz, 1.5 GHz, 2 GHz and 3 GHz, then $m = 4$; selecting 1 GHz gives $z^i = [1, 0, 0, 0]$ and selecting 3 GHz gives $z^i = [0, 0, 0, 1]$. With this representation, the specific configuration parameters of the reconfigurable processor can be recovered from a given configuration information Z.
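The frequency example can be sketched in a few lines of Python. The function names and the option table are hypothetical and chosen only for illustration; the four frequencies are the ones named in the example.

```python
# Hypothetical option table; the four frequencies come from the example in the text.
FREQ_OPTIONS_GHZ = [1.0, 1.5, 2.0, 3.0]


def one_hot(index, m):
    """Return a length-m list with a single 1 at position `index`."""
    if not 0 <= index < m:
        raise ValueError("index out of range")
    return [1 if j == index else 0 for j in range(m)]


def encode_frequency(freq_ghz):
    """Encode a frequency choice as a one-hot vector z^i."""
    return one_hot(FREQ_OPTIONS_GHZ.index(freq_ghz), len(FREQ_OPTIONS_GHZ))


def decode_frequency(z_i):
    """Recover the concrete frequency from a one-hot vector z^i."""
    return FREQ_OPTIONS_GHZ[z_i.index(1)]
```

Decoding is the inverse of encoding, which is the recoverability property the paragraph describes.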

Illustratively, the continuous distribution sampling mode in the embodiment of the present invention is the Gumbel-Softmax sampling mode. The configuration parameter space of the reconfigurable processor under different applications is sampled in the Gumbel-Softmax mode to obtain continuous distribution samples GS(ζ | z) of the configuration information and the distribution values ζ of the continuous distribution samples.

The Gumbel-Softmax sampling mode can approximate the non-differentiable discrete variable contained in the configuration information Z to a continuously distributed variable, so that the non-differentiable discrete variable becomes a differentiable variable.

For example, performance parameters of the reconfigurable processor such as chip area, delay and power consumption are discrete variables and are not differentiable, so a proxy model cannot be optimized accurately with respect to the corresponding configuration information. The Gumbel-Softmax sampling mode can therefore be used to sample the configuration information Z continuously, so that non-differentiable parameters such as chip area, delay and power consumption become differentiable.
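A minimal NumPy sketch of Gumbel-Softmax sampling follows. It assumes the standard construction (Gumbel noise added to logits, then a temperature-scaled softmax); the logits, the temperature value and the function name are illustrative assumptions, and the sketch draws ζ directly from the logits without modelling the conditioning on z.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (continuous) sample zeta on the probability simplex."""
    # Uniform noise bounded away from 0 so that both logarithms stay finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    y = (logits + gumbel_noise) / temperature
    y = y - y.max()  # subtract the maximum to stabilise the exponentials
    e = np.exp(y)
    return e / e.sum()


rng = np.random.default_rng(0)
# Hypothetical logits for the four frequency options of the earlier example.
phi = np.log(np.array([0.1, 0.2, 0.3, 0.4]))
zeta = gumbel_softmax_sample(phi, temperature=0.5, rng=rng)
```

Lower temperatures push ζ toward a one-hot vector, while higher temperatures give smoother samples; the right setting is a tuning choice that the text does not specify.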

Step 302: performance prediction results for the continuously distributed samples are determined based on the proxy model.

In the embodiment of the present invention, the performance of the continuous distribution samples is predicted based on the proxy model, and a performance prediction result m (ζ) of the continuous distribution samples may be generated.

The proxy model m(·) may be a trained proxy model and may be built with an existing MLP (multi-layer perceptron) library. In practical application, a reconfigurable simulator based on heuristic search or on a proxy model can be used to sample performance data of different applications under different configurations of the reconfigurable processor; the configuration information Z, the application A and the actual performance C are assembled into a training sample library, on which the proxy model m(·) built with the existing MLP library is trained. Specifically, the input of the proxy model m(·) is the configuration information Z and the application A, and the output is the performance prediction result corresponding to the configuration information Z, such as non-differentiable performance and cost data like area and power consumption. The prediction accuracy of the proxy model m(·) may be evaluated by the square root error.

Because the configuration information Z of the reconfigurable processor is sampled in a continuous distribution sampling mode, the proxy model m(·) can predict the performance of non-differentiable parameters, the prediction result can be differentiated with respect to the parameter φ (the generation probability of the configuration information), and the optimal application configuration can then be searched for by gradient descent.
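As an illustration of the proxy model's interface, the sketch below runs a forward pass of a small two-layer perceptron in NumPy. The layer sizes, the random placeholder weights, the application descriptor and the two outputs are all assumptions; a real proxy model would be trained on (Z, A, C) samples as described above.

```python
import numpy as np


def mlp_forward(x, params):
    """Forward pass of a two-layer perceptron with a ReLU hidden layer."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(0.0, x @ w1 + b1)
    return hidden @ w2 + b2


def init_params(n_in, n_hidden, n_out, rng):
    """Random placeholder weights standing in for a trained proxy model."""
    return (
        rng.normal(scale=0.1, size=(n_in, n_hidden)),
        np.zeros(n_hidden),
        rng.normal(scale=0.1, size=(n_hidden, n_out)),
        np.zeros(n_out),
    )


rng = np.random.default_rng(0)
zeta = np.array([0.7, 0.1, 0.1, 0.1])   # relaxed configuration sample
app_features = np.array([0.5, 0.2])     # hypothetical descriptor of application A
params = init_params(n_in=6, n_hidden=8, n_out=2, rng=rng)
# Two hypothetical outputs, e.g. predicted area and predicted power.
prediction = mlp_forward(np.concatenate([zeta, app_features]), params)
```

Because the network accepts any real-valued input vector, the same model can score both one-hot configurations and their continuous relaxations.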

Step 303: determining a control variable based on the performance prediction result.

To reduce the variance, a control variable may be introduced. The control variable may be an estimate of the performance prediction result of the continuous distribution samples, specifically the expectation of that result. The control variable c(z) is closely related to the loss function and satisfies formula 1:

$$c(z) = \mathbb{E}_{\zeta \sim GS(\zeta \mid z)}\big[m(\zeta)\big] \qquad (1)$$

where c(z) is the control variable, z is the configuration information, GS(ζ | z) is the continuous distribution sample generated from the configuration information by the Gumbel-Softmax sampling method, ζ is the continuous distribution value of the configuration information generated by the Gumbel-Softmax sampling method, $\mathbb{E}_{\zeta \sim GS(\zeta \mid z)}$ is the expectation over the continuous distribution generated from the configuration information z by the Gumbel-Softmax sampling method, and m(ζ) is the performance prediction result of the continuous distribution samples. It can be seen that the control variable c(z) is closely related to the loss function of the configuration information z.
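Assuming, as the text describes, that the control variable is the expectation of the proxy prediction over Gumbel-Softmax samples, it can be approximated by Monte Carlo averaging. In the sketch below, the conditional sampler `sample_zeta_given_z`, its `sharpness` parameter and the linear toy proxy are hypothetical stand-ins, since the text does not spell out how GS(ζ | z) or m(·) is implemented.

```python
import numpy as np


def sample_zeta_given_z(z_onehot, temperature, rng, sharpness=2.0):
    """Hypothetical stand-in for GS(zeta | z): softmax of scaled z plus Gumbel noise."""
    u = rng.uniform(1e-12, 1.0, size=z_onehot.shape)
    gumbel = -np.log(-np.log(u))
    y = (sharpness * z_onehot + gumbel) / temperature
    e = np.exp(y - y.max())
    return e / e.sum()


def control_variable(z_onehot, proxy, n_samples, temperature, rng):
    """Monte Carlo estimate of c(z) = E_{zeta ~ GS(zeta|z)}[proxy(zeta)]."""
    draws = [
        proxy(sample_zeta_given_z(z_onehot, temperature, rng))
        for _ in range(n_samples)
    ]
    return float(np.mean(draws))


weights = np.array([4.0, 3.0, 2.5, 5.0])  # toy per-option cost (assumption)


def toy_proxy(zeta):
    """Linear toy proxy m(zeta); a trained MLP would take its place."""
    return float(weights @ zeta)


rng = np.random.default_rng(0)
c_z = control_variable(np.array([0.0, 0.0, 1.0, 0.0]), toy_proxy, 256, 0.5, rng)
```

Since every ζ lies on the probability simplex, the estimate for this linear toy proxy always falls between the smallest and largest option cost.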

Step 304: a penalty value is determined based on the actual performance and the target performance of the configuration information sample.

The actual performance of the above configuration information samples may be generated based on a heuristic simulator or a simulator based on a proxy model, for example: the method comprises the steps of using a proxy model m (-) constructed by an existing MLP function library to input configuration information and application program characteristics into the proxy model m (-) constructed by the existing MLP function library to generate actual performance data of configuration information samples, and then using a loss function of the proxy model m (-) to calculate actual performance and target performance of the configuration information samples to determine loss values.

Taking the square root error as the loss function, the actual performance of the configuration information samples and the target performance of the continuous distribution samples are substituted into the square root error loss function to obtain the loss value. The loss function is expressed in terms of the performance prediction result of the continuous distribution samples.
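The loss computation can be illustrated as follows. The text names both a "square root error" and a "square error" loss without preserving the expression, so this sketch shows both under hypothetical function names; which of the two the method actually uses is not established by the text.

```python
import math


def squared_error(actual, target):
    """Sum of squared differences between actual and target performance."""
    return sum((a - t) ** 2 for a, t in zip(actual, target))


def root_squared_error(actual, target):
    """Square root of the squared error (one reading of 'square root error')."""
    return math.sqrt(squared_error(actual, target))
```

For instance, with an actual performance of `[3.0, 4.0]` against a target of `[0.0, 0.0]`, the squared error is 25.0 and its square root is 5.0.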

Step 305: Determining a gradient of the configuration information based on the loss value, the control variable, and a distribution value of the continuous distribution samples.

In practical application, the embodiment of the invention aims to construct a Relax gradient estimator for the parameters of the reconfigurable processor. Constructing the Relax gradient estimator essentially amounts to differentiating the expectation of the loss function with respect to the hidden-variable parameter φ in order to minimize that expectation; the optimal parameter φ can then be searched for by gradient descent, and the optimal configuration information determined from it. Meanwhile, to reduce the parameter search variance, a control variable is introduced. Based on the control variable, the Relax gradient estimator can be expressed as formula 2, whose terms are the loss function, the gradient of the configuration information, a component function, and the expectation of the loss function.

As can be seen from formula 2, the derivation does not require the gradient of the loss function; only the value of the loss function needs to be calculated, so non-differentiable configuration parameters can be supported. The core idea is to add a control variable c(z) that is closely correlated with the loss function, thereby reducing the variance and improving the search efficiency. The control variable method is an effective means of reducing variance in the Monte Carlo method: it reduces the estimation error of an unknown quantity by using knowledge of a known quantity, thereby improving the search quality.
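The variance-reduction effect of a control variable can be demonstrated on a toy problem. The sketch below is not the estimator of formula 2: as a simplifying assumption it uses a constant baseline (the mean loss) instead of a z-dependent control variable, and the loss values are invented for illustration. It compares score-function gradient samples for a categorical distribution with and without the baseline.

```python
import numpy as np


def score_function_grads(phi, loss_values, baseline, n_samples, rng):
    """Per-sample gradients (L(z) - baseline) * grad_phi log p_phi(z)."""
    p = np.exp(phi - phi.max())
    p /= p.sum()
    ks = rng.choice(len(p), size=n_samples, p=p)
    onehots = np.eye(len(p))[ks]
    # For a softmax-parameterised categorical, grad_phi log p(k) = onehot(k) - p.
    return (loss_values[ks] - baseline)[:, None] * (onehots - p)


rng = np.random.default_rng(0)
phi = np.zeros(4)                                  # uniform over four options
loss_values = np.array([10.0, 11.0, 12.0, 13.0])   # toy losses (assumption)

plain = score_function_grads(phi, loss_values, 0.0, 20000, rng)
with_cv = score_function_grads(phi, loss_values, loss_values.mean(), 20000, rng)

var_plain = plain.var(axis=0).sum()
var_cv = with_cv.var(axis=0).sum()
exact_grad = 0.25 * (loss_values - loss_values.mean())
```

Both estimators target the same gradient, but subtracting the baseline removes the large common offset in the loss values, so `var_cv` comes out far smaller than `var_plain`.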

The embodiment of the invention predicts the non-differentiable configuration information with the proxy model constructed from the existing MLP function library. The Gumbel-Softmax sampling method GS(ζ | Z) approximates the configuration information Z by a continuous distribution ζ, so that the continuous distribution sample ζ and the discrete distribution Z are closely related, namely ζ ~ GS(ζ | Z). The gradient estimate of formula 2 can therefore be rewritten as formula 3, whose terms are likewise the loss function, the gradient of the configuration information, a component function, and the expectation of the loss function.

As can be seen from formula 3, the rewritten gradient formula gives the derivative of the expectation of the loss function with respect to the parameter φ, and consists of three parts: (i) the reinforce part, which is the reinforcement learning term and can be calculated by sampling z ~ $p_\phi(z)$ and ζ ~ GS(ζ | z) with the Gumbel-Softmax method; (ii) the correction part, which is a correlation term between the control variable c(z) and the parameter φ (the generation probability of the configuration information) and can be calculated using the Gumbel-Softmax function GS(ζ | z); (iii) the derivative generated by Gumbel-Softmax, which can be calculated with the reparameterization trick. The three parts can be calculated with the interface functions provided by a reparameterization function library and an automatic differentiation function library.
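One way to write a three-part decomposition of this kind is given below. It is a hedged reconstruction rather than the document's own formula: the loss symbol $\mathcal{L}$ is assumed, and the expression simply combines the score-function identity with a control variable equal to the expected proxy prediction over ζ ~ GS(ζ | z).

```latex
\begin{aligned}
\nabla_{\phi}\, \mathbb{E}_{p_{\phi}(z)}\big[\mathcal{L}(z)\big]
 ={}& \underbrace{\mathbb{E}_{p_{\phi}(z)}\,\mathbb{E}_{GS(\zeta \mid z)}
      \big[\big(\mathcal{L}(z) - m(\zeta)\big)\,\nabla_{\phi}\log p_{\phi}(z)\big]}_{\text{(i) reinforce}} \\
 &- \underbrace{\mathbb{E}_{p_{\phi}(z)}\big[\nabla_{\phi}\,
      \mathbb{E}_{GS(\zeta \mid z)}[m(\zeta)]\big]}_{\text{(ii) correction}}
  + \underbrace{\nabla_{\phi}\,\mathbb{E}_{p_{\phi}(z)}\,
      \mathbb{E}_{GS(\zeta \mid z)}[m(\zeta)]}_{\text{(iii) Gumbel-Softmax}}
\end{aligned}
```

Terms (ii) and (iii) together equal $\mathbb{E}_{p_\phi(z)}[c(z)\,\nabla_\phi \log p_\phi(z)]$, so the sum reduces to the plain score-function gradient and the estimator stays unbiased while term (i) has lower variance.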

Step 306: If the gradient of the configuration information Z satisfies the iteration trigger condition, update the configuration information Z using a back-propagation algorithm; if the gradient of the configuration information Z satisfies the iteration termination condition, determine the configuration information as the application configuration based on the gradient of the configuration information Z.

When the proxy model m(·) of the embodiment of the invention predicts the performance of the continuous distribution samples, it is a trained proxy model, so its parameters are fixed. The generation probability parameter φ of the configuration information is then updated continuously by gradient descent: the proxy model m(·), built with an existing MLP function library, is differentiated with respect to the configuration information Z, and the result is the gradient of the configuration information Z. The iteration terminates when the number of iterations is greater than or equal to a preset number, or when the difference between the gradient of the configuration information Z and a preset gradient is less than a preset threshold. For example, if over several consecutive iteration cycles the difference between the gradient of the configuration information Z and the preset gradient stays within a preset range, the gradient may be regarded as fluctuating around the preset value over those cycles, and the iteration may be regarded as terminated. The number of consecutive iteration cycles may be 2 or more.
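The two termination conditions described above can be sketched as a small helper. This is only an illustration: the function name `should_terminate`, its parameters, and the choice to compare a scalar gradient magnitude (rather than a full gradient vector) against the preset gradient are assumptions, not part of the embodiment.

```python
def should_terminate(iteration, grad_norms, max_iters, preset_grad, threshold, window=2):
    """Decide whether to stop iterating.

    iteration    -- number of iterations completed so far
    grad_norms   -- history of gradient magnitudes, most recent last
    max_iters    -- preset number of iterations
    preset_grad  -- preset gradient value to compare against
    threshold    -- preset threshold on |gradient - preset_grad|
    window       -- number of consecutive cycles that must satisfy the threshold
    """
    # Condition 1: the iteration count has reached the preset number.
    if iteration >= max_iters:
        return True
    # Condition 2: the gradient stayed close to the preset gradient
    # for `window` consecutive cycles.
    if len(grad_norms) < window:
        return False
    recent = grad_norms[-window:]
    return all(abs(g - preset_grad) < threshold for g in recent)


print(should_terminate(5, [0.5, 0.004, 0.003], max_iters=100, preset_grad=0.0, threshold=0.01))
```

Requiring several consecutive cycles inside the threshold guards against stopping on a single noisy gradient estimate.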

In the embodiment of the invention, the control variable is introduced into the gradient of the configuration information and the configuration information Z is sampled in a continuous manner, so that the proxy model m(·) built with an existing MLP (multilayer perceptron) function library can produce a performance prediction result from the continuous distribution samples. The proxy model m(·) can therefore predict performance for non-differentiable discrete variables, and the configuration information Z can be updated by gradient descent, which avoids the search failures of proxy-based methods in the prior art. In addition, the estimate of the performance prediction result is used as the control variable, so that the control variable is closely related to the loss function; introducing it into the gradient calculation of the configuration information reduces the sampling variance, improves the search efficiency, and reduces the cost of the search process.

Specifically, the generation parameters of the reconfigurable processor configuration information are determined step by step in the iterative optimization of the RELAX estimator. At each iteration, the distribution p_φ(z) of the configuration information Z is first generated from the parameter φ (the generation probability of the configuration information), and the GS sample ζ is then generated. Next, the proxy model m(·) built with an existing MLP function library is used to calculate the performance prediction result m(ζ) of the continuous distribution sample and the loss value; the calculation of the loss value needs to be designed according to the specific application requirements and may be a linear combination of several performance indexes. The control variable C(Z) is then calculated, and the gradient of the configuration information is calculated from the three parts of formula 3, which can be obtained with the interface functions and the automatic differentiation function library provided by an existing reparameterization function library. Finally, the proxy model m(·) is updated as needed to keep improving its prediction accuracy, and the generation controller of the configuration information Z is updated.
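The iteration described above can be sketched in a deliberately simplified form. The sketch below is not the estimator of formula 3: it keeps only a score-function (REINFORCE) update with a baseline computed from Gumbel-Softmax samples passed through a stand-in proxy, and it omits the reparameterization term. The linear `proxy`, the values in `PERF_PER_OPTION`, the target `TARGET_PERF`, and all hyperparameters are hypothetical.

```python
import math
import random


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, temperature, rng):
    """Relaxed sample zeta drawn with Gumbel noise added to the logits."""
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    return softmax(noisy)


# Stand-in for the trained proxy model m(.): a fixed linear map (hypothetical).
PERF_PER_OPTION = [1.0, 2.0, 3.0, 4.0]
TARGET_PERF = 3.0


def proxy(sample):
    return sum(w * s for w, s in zip(PERF_PER_OPTION, sample))


def loss(sample):
    return (proxy(sample) - TARGET_PERF) ** 2


def search(steps=500, batch=32, lr=0.2, temperature=0.5, seed=0):
    rng = random.Random(seed)
    phi = [0.0] * len(PERF_PER_OPTION)   # logits of the generation probability
    n = len(phi)
    for _ in range(steps):
        probs = softmax(phi)             # p_phi(z)
        # Baseline: mean proxy loss over relaxed samples, drawn independently
        # of the discrete samples below so the estimate stays unbiased.
        baseline = sum(loss(gumbel_softmax(phi, temperature, rng)) for _ in range(batch)) / batch
        grad = [0.0] * n
        for _ in range(batch):
            k = rng.choices(range(n), weights=probs)[0]   # discrete configuration z
            one_hot = [1.0 if i == k else 0.0 for i in range(n)]
            advantage = loss(one_hot) - baseline
            # d log p_phi(z = k) / d phi_i = one_hot_i - probs_i
            for i in range(n):
                grad[i] += advantage * (one_hot[i] - probs[i]) / batch
        phi = [p - lr * g for p, g in zip(phi, grad)]     # gradient descent on phi
    return softmax(phi)


final_probs = search()
best = max(range(len(final_probs)), key=final_probs.__getitem__)
print(best, [round(p, 3) for p in final_probs])
```

In this toy setting the option whose predicted performance equals the target has zero loss, so the generation probability is expected to concentrate on it. A fuller implementation would add the correction and reparameterization terms through an automatic differentiation library.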

Tests show that when the accuracy of the proxy model m(·) of the embodiment of the invention is high, the search can be completed within a few minutes; when the accuracy of the proxy model m(·) is not very high, some search speed is lost, but the whole search process remains fully automatic. This removes the need to construct an accurate differentiable proxy function manually and allows the search method to be applied quickly to new design scenarios.

In the embodiment of the present invention, the terminal may be divided into functional units according to the above method example. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. It should be noted that the division of the modules in the embodiments of the present disclosure is illustrative and is only a division by logical function; other divisions are possible in actual implementation.

In the case of adopting each functional module divided corresponding to each function, the embodiment of the present disclosure provides an application configuration determining apparatus. Fig. 4 shows a block schematic diagram of modules of an application configuration determining apparatus of an embodiment of the present invention. As shown in fig. 4, the apparatus 400 for determining an application configuration includes:

the sampling module 401 is configured to sample configuration information of the reconfigurable processor in a continuous distribution sampling manner, and obtain continuous distribution samples of the configuration information and distribution values of the configuration information;

a determining module 402, configured to determine a performance prediction result of the continuous distribution sample based on a proxy model, determine a control variable based on the performance prediction result, determine a loss value based on an actual performance and a target performance of the configuration information sample, and determine a gradient of the configuration information based on the loss value, the control variable, and a distribution value of the continuous distribution sample;

an updating module 403, configured to update the configuration information by using a back propagation algorithm if the gradient of the configuration information satisfies an iteration trigger condition, and determine that the configuration information is an application configuration based on the gradient of the configuration information if the gradient of the configuration information satisfies an iteration termination condition.

In one embodiment, the continuous distribution sampling mode is a Gumbel-Softmax sampling mode.
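A minimal sketch of Gumbel-Softmax sampling, using only the Python standard library, may help make this concrete. The function name `gumbel_softmax_sample`, the example probabilities, and the temperature values are assumptions for illustration; the embodiment itself relies on an existing function library.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (continuous) sample from a categorical distribution."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(u)), with u kept strictly inside (0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
logits = [math.log(p) for p in (0.1, 0.6, 0.3)]   # three hypothetical configuration options
soft = gumbel_softmax_sample(logits, temperature=5.0, rng=rng)
sharp = gumbel_softmax_sample(logits, temperature=0.1, rng=rng)
print(soft, sharp)
```

Each sample is a point on the probability simplex. Lower temperatures tend to give samples closer to a one-hot (discrete) choice, while higher temperatures give smoother samples that are easier to differentiate through.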

In one embodiment, the control variables satisfy:

where C(Z) is the control variable, Z is the configuration information, the parameter φ is the generation probability of the configuration information, GS(ζ | Z) is the continuous distribution sample generated from the configuration information by the Gumbel-Softmax sampling method, p_φ(GS(ζ | z)) is the continuous distribution value generated from the configuration information by the Gumbel-Softmax sampling method, and the remaining two terms of the formula are, respectively, the expectation over the continuous distribution generated from the configuration information by the Gumbel-Softmax sampling method and the performance prediction result.

In one embodiment, the determining module 402 is further configured to determine the actual performance of the configuration information sample, where the actual performance is obtained from a heuristic-based simulator or a proxy-model-based simulator; or, the actual performance of the configuration information sample is the performance output by the proxy model with the configuration information and the application characteristics as input.

In one embodiment, the gradient of the configuration information satisfies a gradient formula, namely the derivative of the expectation of the loss function of the proxy model with respect to the target distribution probability that determines the configuration information. The gradient formula (formula 3) satisfies:

where the symbols of the formula denote, respectively, the loss function, the gradient of the configuration information, the component function, and the expectation of the loss function; (i) REINFORCE is the reinforcement learning part, (ii) correction is the correlation term between the control variable C(Z) and the parameter φ (the generation probability of the configuration information), and (iii) Gumbel-Softmax is the derivative term generated by Gumbel-Softmax.

In one embodiment, the iteration trigger condition is that the number of iterations is greater than or equal to a preset number; and/or the iteration trigger condition is that the difference between the gradient and a preset gradient is smaller than a preset threshold.

In one embodiment, the proxy model m(·) is a trained proxy model, and the updating module 403 is configured to differentiate the proxy model m(·) with respect to the configuration information Z to obtain a derivative result, and to update the configuration information Z based on the derivative result.
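Differentiating a fixed proxy model with respect to its input can be sketched by hand for a very small network. The one-hidden-layer structure, the weight values `W1`, `B1`, `W2`, `B2`, and the function names below are hypothetical; in practice an automatic differentiation library would compute this derivative.

```python
import math


def proxy_predict(x, w1, b1, w2, b2):
    """One-hidden-layer MLP: m(x) = w2 . tanh(W1 x + b1) + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def proxy_input_gradient(x, w1, b1, w2, b2):
    """Gradient of m(x) with respect to the input x; the weights stay fixed."""
    grad = [0.0] * len(x)
    for row, b, w_out in zip(w1, b1, w2):
        h = math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        scale = w_out * (1.0 - h * h)   # chain rule through tanh
        for j, w in enumerate(row):
            grad[j] += scale * w
    return grad


# Hypothetical fixed ("trained") weights: 3 inputs, 2 hidden units.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
B1 = [0.0, 0.1]
W2 = [1.5, -0.7]
B2 = 0.2

zeta = [0.2, 0.7, 0.1]   # a relaxed configuration sample
grad = proxy_input_gradient(zeta, W1, B1, W2, B2)
print(grad)
```

Because the proxy weights are held fixed, only the input side receives a gradient, which is what allows the configuration side to be updated by back propagation.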

Fig. 5 shows a schematic block diagram of a chip of an embodiment of the invention. As shown in fig. 5, the chip 500 includes one or more processors 501 and a communication interface 502. The communication interface 502 may support the server in performing the data transceiving steps of the above method, and the processor 501 may support the server in performing the data processing steps of the above method.

Optionally, as shown in fig. 5, the chip 500 further includes a memory 503. The memory 503 may include a read-only memory and a random access memory, and provides the processor with operation instructions and data. A portion of the memory 503 may also include non-volatile random access memory (NVRAM).

In some embodiments, as shown in fig. 5, the processor 501 performs the corresponding operation by calling an operation instruction stored in the memory (the operation instruction may be stored in the operating system). The processor 501 controls the processing operation of any of the terminal devices, and may also be referred to as a central processing unit (CPU). The memory 503 may include both read-only memory and random access memory and provides instructions and data to the processor 501. A portion of the memory 503 may also include NVRAM. In a specific application, the processor 501, the communication interface 502, and the memory 503 are coupled together by a bus system, which may include a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as bus system 504 in fig. 5.

The method disclosed by the embodiment of the invention can be applied to a processor or implemented by the processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a digital signal processor (DSP), an ASIC, an FPGA (field-programmable gate array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present disclosure. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present disclosure may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in a memory, and the processor reads information in the memory and completes the steps of the method in combination with its hardware.

An embodiment of the present invention further provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor, the computer program, when executed by the at least one processor, is operative to cause the electronic device to perform a method according to embodiments of the disclosure.

Embodiments of the present invention also provide a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor of a computer, is configured to cause the computer to perform a method according to an embodiment of the present disclosure.

With reference to fig. 6, a block diagram of an electronic device that may serve as a server or a client of an embodiment of the present invention is now described; it is an example of a hardware device to which aspects of the present disclosure may be applied. The electronic device is intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 6, the electronic device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. The RAM 603 can also store various programs and data necessary for the operation of the electronic device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

Various components in the electronic device 600 are connected to the I/O interface 605, including an input unit 606, an output unit 607, a storage unit 608, and a communication unit 609. The input unit 606 may be any type of device capable of inputting information to the electronic device 600; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device. The output unit 607 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 608 may include, but is not limited to, magnetic or optical disks. The communication unit 609 allows the electronic device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers, and/or chipsets, such as Bluetooth devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

As shown in fig. 6, the computing unit 601 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the respective methods and processes described above. For example, in some embodiments, the methods of the exemplary embodiments of the present disclosure may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 600 via the ROM 602 and/or the communication unit 609. In some embodiments, the computing unit 601 may be configured to perform the method by any other suitable means (e.g., by means of firmware).

Program code for implementing methods of embodiments of the present invention may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of embodiments of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

As used in accordance with embodiments of the present invention, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present disclosure are performed in whole or in part. The computer may be a general purpose computer, special purpose computer, computer network, terminal, user equipment, or other programmable device. The computer program or instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer program or instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire or wirelessly. The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that integrates one or more available media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape; or an optical medium, such as a Digital Video Disc (DVD); it may also be a semiconductor medium, such as a Solid State Drive (SSD).

While the embodiments of the invention have been described in conjunction with specific features and embodiments thereof, it will be evident that various modifications and combinations can be made thereto without departing from the spirit and scope of the disclosure. Accordingly, the specification and figures are merely exemplary of the invention as defined in the appended claims and are intended to cover any and all modifications, variations, combinations, or equivalents within the scope of the invention. It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present disclosure and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for application configuration determination, the method comprising:

determining a control variable based on the performance prediction result;

and if the gradient of the configuration information meets an iteration triggering condition, updating the configuration information by adopting a back propagation algorithm, and if the gradient of the configuration information meets an iteration termination condition, determining the configuration information to be application configuration based on the gradient of the configuration information.

2. The method according to claim 1, wherein the continuous distribution mode is a Gumbel-Softmax sampling mode.

3. The application configuration determination method according to claim 1, wherein the control variables satisfy:

wherein C(Z) is the control variable, Z is the configuration information, the parameter φ is the generation probability of the configuration information, GS(ζ | Z) is the continuous distribution sample generated from the configuration information by the Gumbel-Softmax sampling method, and the remaining two terms of the formula are, respectively, the continuous distribution value generated from the configuration information by the Gumbel-Softmax sampling method and the performance prediction result of the continuous distribution samples.

4. The application configuration determining method of claim 1, wherein the actual performance of the configuration information samples is obtained from a heuristic-based simulator or a proxy-model-based simulator; or

5. The application configuration determination method according to claim 1, wherein the gradient of the configuration information satisfies a gradient formula, the gradient formula being a derivative formula of the expectation of the loss function of the proxy model to a target distribution probability that determines the configuration information, the gradient formula satisfying:

wherein the symbols of the formula denote, respectively, the loss function, the gradient of the configuration information, the component function, and the expectation of the loss function to be minimized; (i) REINFORCE is the reinforcement learning part, (ii) correction is the correlation term between the control variable C(Z) and the generation probability parameter φ of the configuration information, and (iii) Gumbel-Softmax is the derivative term generated by Gumbel-Softmax.

6. The application configuration determination method according to any one of claims 1 to 5, wherein the iteration trigger condition is that the number of iterations is greater than or equal to a preset number; and/or

7. The method according to any one of claims 1 to 5, wherein the proxy model is a trained proxy model, and the updating the configuration information by using a back propagation algorithm includes:

differentiating the proxy model with respect to the configuration information to obtain a derivation result;

updating the configuration information based on the derivation result.

8. An application configuration determining apparatus, characterized in that the application configuration determining apparatus comprises:

a determining module, configured to determine a performance prediction result of the continuously distributed samples based on a proxy model, determine a control variable based on the performance prediction result, determine a loss value based on actual performance and target performance of the continuously distributed samples of the configuration information, and determine a gradient of the configuration information based on the loss value, the control variable, and a distribution value of the continuously distributed samples of the configuration information;

9. An electronic device comprising a memory and a processor, the memory for storing computer instructions, wherein the computer instructions are executable by the processor to implement the method of any one of claims 1 to 7.

10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method according to any one of claims 1 to 7.