CN117892667B

CN117892667B - Method for setting arithmetic unit chip, computing subsystem and intelligent computing platform

Info

Publication number: CN117892667B
Application number: CN202410295284.5A
Authority: CN
Inventors: 邓练兵; 巩志国; 官全龙
Original assignee: Guangdong Qinzhi Technology Research Institute Co ltd
Current assignee: Guangdong Qinzhi Technology Research Institute Co ltd
Priority date: 2024-03-15
Filing date: 2024-03-15
Publication date: 2024-06-04
Anticipated expiration: 2044-03-15
Also published as: CN117892667A

Abstract

The application belongs to the field of data processing and relates in particular to an arithmetic unit chip setting method, a computing subsystem and an intelligent computing platform. The method comprises the following steps: in response to a setting instruction for a target chip, generating an initial chip population corresponding to the target chip; performing strategy optimization on the initial chip population using a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population; performing crossover and mutation operations on the optimal initial individuals to obtain a child chip population of the target chip; and iterating over the child chip population until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and then setting the chip parameters of the target chip based on the optimal child individual in the last-generation chip population. The method can improve chip optimization efficiency, reduce the power consumption and optimize the energy efficiency of the arithmetic unit, balance low energy consumption against high performance, and improve the operating efficiency of the equipment.

Description

Method for setting arithmetic unit chip, computing subsystem and intelligent computing platform

Technical Field

The application belongs to the field of data processing, and particularly relates to an arithmetic unit chip setting method, a computing subsystem and an intelligent computing platform.

Background

At present, to promote the adoption of intelligent applications across industries and fields, there is a strong need to build an intelligent computing platform that supports the construction of intelligent supercomputing centers, provides a foundation for artificial intelligence platforms serving scientific research, industry and urban services, and thereby helps attract talent and drive industrial upgrading and development.

With the increasing application of computationally intensive tasks such as artificial intelligence, deep learning, etc., traditional high-energy-consumption devices have failed to meet the demands of environmental protection. The construction of the low-energy-consumption arithmetic unit is beneficial to reducing the power consumption of equipment, reducing the energy consumption of the whole data center, further saving energy, reducing carbon emission and promoting green sustainable development.

In the related art, the chip design of a low-power arithmetic unit involves optimizing several targets at once, such as performance, power consumption and heat dissipation. Trade-offs between these targets must be considered during optimization, so design optimization is difficult. Traditional manual tuning often cannot fully account for the interactions among multiple factors, and good parameters are hard to obtain. In addition, as the functional requirements on chips grow, circuit complexity increases, which raises power consumption. Higher power consumption not only shortens the battery life of the equipment but also raises the heat dissipation requirement, and a complex heat dissipation system in turn adds to the complexity of the design.

Therefore, how to improve the chip optimization efficiency and ensure the balance between low energy consumption and high performance is a technical problem to be solved urgently.

Disclosure of Invention

The application provides an arithmetic unit chip setting method, a computing subsystem and an intelligent computing platform, which are used for improving chip optimization efficiency, reducing power consumption and optimizing energy efficiency of an arithmetic unit, realizing balance of low energy consumption and high performance for the arithmetic unit and improving operation efficiency of equipment.

In a first aspect, the present application provides an arithmetic unit chip setting method, the method comprising:

in response to a setting instruction for a target chip, generating an initial chip population corresponding to the target chip; wherein each initial individual in the initial chip population represents an initialization setting state of the target chip, and each initialization setting state corresponds to a set of initial chip parameters of the target chip;

performing strategy optimization on the initial chip population using a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population;

performing crossover and mutation operations on the optimal initial individuals to obtain a child chip population of the target chip; wherein each child individual in the child chip population represents an iteration setting state of the target chip, and each iteration setting state corresponds to a set of iteration chip parameters of the target chip;

performing an iterative loop on the child chip population until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and setting the chip parameters of the target chip based on the optimal child individual in the last-generation chip population; wherein the chip parameters include at least a chip structure, a logic unit and a circuit wiring structure.

In a second aspect, embodiments of the present application provide a computing subsystem, the system comprising:

a generating unit configured to generate, in response to a setting instruction for a target chip, an initial chip population corresponding to the target chip; wherein each initial individual in the initial chip population represents an initialization setting state of the target chip, and each initialization setting state corresponds to a set of initial chip parameters of the target chip;

an optimizing unit configured to perform strategy optimization on the initial chip population using a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population;

an iteration setting unit configured to perform crossover and mutation operations on the optimal initial individuals to obtain a child chip population of the target chip, wherein each child individual in the child chip population represents an iteration setting state of the target chip, and each iteration setting state corresponds to a set of iteration chip parameters of the target chip; and to perform an iterative loop on the child chip population until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and set the chip parameters of the target chip based on the optimal child individual in the last-generation chip population; wherein the chip parameters include at least a chip structure, a logic unit and a circuit wiring structure.

In a third aspect, embodiments of the present application provide a computing device, the computing device comprising:

at least one processor, a memory, and an input/output unit;

wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program stored in the memory to perform the arithmetic unit chip setting method of the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, comprising instructions that, when executed on a computer, cause the computer to perform the arithmetic unit chip setting method of the first aspect.

In the technical solution provided by the embodiments of the application, an initial chip population corresponding to a target chip is first generated in response to a setting instruction for the target chip. Each initial individual in the initial chip population represents one initialization setting state of the target chip, and each initialization setting state corresponds to one set of initial chip parameters of the target chip. Strategy optimization is then performed on the initial chip population using a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population. Next, crossover and mutation operations are performed on the optimal initial individuals to obtain a child chip population of the target chip. Each child individual in the child chip population represents one iteration setting state of the target chip, and each iteration setting state corresponds to one set of iteration chip parameters of the target chip. The crossover and mutation operations introduce diversity and exploration into the chip setting scheme, so that more candidate solutions are found in the search space. This helps avoid the tendency of the related art to become trapped in a local optimum, favors finding a global optimum, improves chip performance, and helps balance performance against power consumption. Finally, the child chip population is iterated until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and the chip parameters of the target chip are then set based on the optimal child individual in the last-generation chip population. Through this iterative loop, the child chip population is optimized, chip performance improves step by step, and a solution that meets the optimization target is found while performance and power consumption remain balanced. The chip parameters include at least a chip structure, a logic unit and a circuit wiring structure.

The technical solution of the application provides an automated, intelligent arithmetic unit chip setting flow. Parameter optimization of the target chip is achieved through the learning and optimization that the dynamic evaluation optimization model performs on the chip population, together with crossover, mutation and iteration of the chip population. This reduces the workload and time of manual design in the related art, helps avoid becoming trapped in a local optimum, favors finding a global optimum, improves chip performance and chip optimization efficiency, and helps balance low energy consumption against high performance.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:

FIG. 1 is a flow chart of an arithmetic unit chip setting method according to an embodiment of the application;

FIG. 2 is a schematic diagram of an iterative loop method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a dynamic evaluation optimization model according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a pre-configured layer according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a simulation parameter layer according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an evaluation policy network layer according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a computing subsystem according to an embodiment of the application;

FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.

In order to solve at least one of the above technical problems, an embodiment of the present application provides an arithmetic unit chip setting method, a computing subsystem, and an intelligent computing platform.

Specifically, in the arithmetic unit chip setting scheme, an initial chip population corresponding to a target chip is first generated in response to a setting instruction for the target chip. Each initial individual in the initial chip population represents one initialization setting state of the target chip, and each initialization setting state corresponds to one set of initial chip parameters of the target chip. Strategy optimization is then performed on the initial chip population using a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population. Next, crossover and mutation operations are performed on the optimal initial individuals to obtain a child chip population of the target chip. Each child individual in the child chip population represents one iteration setting state of the target chip, and each iteration setting state corresponds to one set of iteration chip parameters of the target chip. The crossover and mutation operations introduce diversity and exploration into the chip setting scheme, so that more candidate solutions are found in the search space. This helps avoid the tendency of the related art to become trapped in a local optimum, favors finding a global optimum, improves chip performance, and helps balance performance against power consumption. Finally, the child chip population is iterated until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and the chip parameters of the target chip are then set based on the optimal child individual in the last-generation chip population. This process gradually optimizes chip performance and converges toward a better solution, which further improves chip performance while keeping performance and power consumption balanced. The chip parameters include at least a chip structure, a logic unit and a circuit wiring structure.

The arithmetic unit chip setting scheme thus provides an automated, intelligent arithmetic unit chip setting flow. Parameter optimization of the target chip is achieved through the learning and optimization that the dynamic evaluation optimization model performs on the chip population, together with crossover, mutation and iteration of the chip population. This reduces the workload and time of manual design in the related art, helps avoid becoming trapped in a local optimum, favors finding a global optimum, improves chip performance and chip optimization efficiency, and helps balance low energy consumption against high performance.

The arithmetic unit chip setting scheme provided by the embodiments of the application may be executed by an electronic device, which may be a server, a server cluster or a cloud server. The electronic device may also be a terminal device such as a mobile phone, a computer, a tablet computer or a wearable device, or a dedicated device (for example, a dedicated terminal device with an arithmetic unit chip setting system). In an alternative embodiment, a service program for executing the arithmetic unit chip setting scheme may be installed on the electronic device.

FIG. 1 is a flow chart of an arithmetic unit chip setting method according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:

101, Responding to a setting instruction of a target chip, and generating an initial chip population corresponding to the target chip.

In the embodiment of the application, each initial individual in the initial chip population is used for representing an initialization setting state of the target chip, and each initialization setting state corresponds to a set of initial chip parameters of the target chip. Specifically, if a target chip is to be designed, such as an embedded chip for smart home control, each initial individual may contain the following set of initial chip parameters:

Processor type: such as ARM Cortex-M4 or RISC-V architecture

Memory size: such as 4KB, 8KB or 16KB

Storage capacity: such as 128MB, 256MB or 512MB of flash memory

Communication interface: such as Wi-Fi, Bluetooth or Zigbee

Power consumption level: such as low, medium or high power consumption

The parameter combinations in the initial individuals represent the different initialization settings possible for the target chip. For each initial individual, the initial chip parameters together determine the characteristics and performance of the chip that the individual represents. The parameters can be generated randomly or set using prior knowledge or experience, so that the individuals in the population cover the diversity of the target chip's design space and give the subsequent optimization algorithm a sufficient search space. By performing fitness evaluation and genetic operations on the parameters of the initial individuals, the individuals in the initial chip population can be continually improved until a chip design that best meets the target requirements is found. The initialization setting state represented by each initial individual is thus one sampled point in the target chip's design space.

For example, assume that the target chip to be designed is a low-power arithmetic unit chip whose goal is to balance performance, power consumption and area, and which may perform tasks such as image processing and artificial intelligence computation. In this example, an initial chip population is generated for the target, and each initial individual represents one initialization setting state of the target chip and contains a set of initial chip parameters, as follows:

1. The initial chip parameters for individual 1 are as follows:

- Processor type: ARM Cortex-A78

- GPU: Mali-G78

- Memory: 8GB LPDDR5

- Storage: 256GB UFS 3.1

- Process node: 7 nm

- Performance: strong single-core performance, suitable for image processing tasks

2. The initial chip parameters for individual 2 are as follows:

- Processor type: Qualcomm Snapdragon 888

- GPU: Adreno 660

- Memory: 12GB LPDDR5

- Storage: 512GB UFS 3.1

- Process node: 5 nm

- Performance: excellent multi-core performance, suitable for artificial intelligence computation

3. The initial chip parameters for individual 3 are as follows:

- Processor type: Apple A15 Bionic

- GPU: Apple GPU

- Memory: 16GB LPDDR5

- Storage: 1TB NVMe

- Process node: 5 nm

- Performance: excellent overall performance, suitable for multitasking

Generating an initial chip population with different parameter combinations allows different design schemes to be tried during optimization. The fitness of each individual is evaluated, individuals with higher fitness are selected for the subsequent genetic operations, and through continued evolution and optimization a design scheme that better meets the chip design target is finally obtained.
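The selection step mentioned above can be sketched as follows. This is a minimal illustration rather than the method of the application: `select_fittest` and `toy_fitness` are hypothetical names, only the memory size and process node of the three example individuals are kept, and the scoring rule is an assumption chosen to make the example concrete.

```python
def select_fittest(population, fitness, k):
    """Return the k individuals with the highest fitness (truncation selection)."""
    return sorted(population, key=fitness, reverse=True)[:k]


# The three individuals mirror the example above; only two fields are kept.
population = [
    {"name": "individual 1", "memory_gb": 8, "process_nm": 7},
    {"name": "individual 2", "memory_gb": 12, "process_nm": 5},
    {"name": "individual 3", "memory_gb": 16, "process_nm": 5},
]


def toy_fitness(ind):
    # Hypothetical score (an assumption): more memory is better,
    # a smaller process node is better.
    return ind["memory_gb"] - ind["process_nm"]


parents = select_fittest(population, toy_fitness, k=2)
print([p["name"] for p in parents])
```

In practice the fitness would come from simulated performance, power and area figures rather than from a hand-written rule.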

In another example, the initial chip population corresponding to the target chip may also be randomly generated by a preset mechanism. It is assumed that a target chip, i.e., an embedded chip for an internet of things device, is to be designed. The initial chip population corresponding to the target chip can be randomly generated through a preset mechanism, and the specific steps are as follows:

First, determine the size of the initial chip population: the number of individuals, for example 100, is chosen according to design requirements and available computing resources. Second, determine the range of each chip parameter: for each parameter of the target chip, such as processor type, memory and power consumption, a value range is defined. For example, the processor type may be ARM Cortex-M4 or RISC-V, and the memory may be 4KB or 8KB. Third, randomly generate an initial individual: a value is drawn at random for each parameter from its range, so that each individual represents one initial setting state of the target chip. For example, the processor type is chosen at random from ARM Cortex-M4 and RISC-V, and the memory at random from 4KB and 8KB. Finally, repeat the third step until the population size is reached, so that the specified number of initial individuals forms the initial chip population.

By means of the preset mechanism, a group of initial individuals with different parameter combinations can be randomly generated, and the individuals represent different initial setting states of the target chip. These individuals can be optimized and improved to gradually approach and exceed the design requirements of the target chip during subsequent iterations of the evolutionary algorithm.
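The random generation steps above can be sketched as follows. This is a minimal sketch under stated assumptions: `DESIGN_SPACE`, `random_individual` and `generate_population` are hypothetical names, and the value ranges are taken from the illustrative parameter lists earlier in the description.

```python
import random

# Hypothetical design space: names and value ranges are illustrative
# assumptions based on the examples in the text.
DESIGN_SPACE = {
    "processor": ["ARM Cortex-M4", "RISC-V"],
    "memory_kb": [4, 8, 16],
    "flash_mb": [128, 256, 512],
    "interface": ["Wi-Fi", "Bluetooth", "Zigbee"],
    "power_level": ["low", "medium", "high"],
}


def random_individual(space, rng):
    """Draw one value per parameter to form a single initial individual."""
    return {name: rng.choice(values) for name, values in space.items()}


def generate_population(space, size, seed=None):
    """Repeat the random draw until the population reaches the requested size."""
    rng = random.Random(seed)
    return [random_individual(space, rng) for _ in range(size)]


population = generate_population(DESIGN_SPACE, size=100, seed=0)
print(len(population), population[0])
```

Passing a seed makes the generated population reproducible, which is convenient when comparing optimization runs.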

102, Performing strategy optimization on the initial chip population by adopting a dynamic evaluation optimization model to obtain the optimal initial individuals in the initial chip population.

In this embodiment, a dynamic evaluation optimization model may be employed to strategically optimize the initial chip population to obtain optimal initial individuals in the population. First, the problem needs to be well defined. For example, an embedded chip is designed based on a balance among performance, power consumption, and area. This problem can be formulated as a dynamic evaluation optimization problem in which the parameters of the chip design are regarded as actions of the agent and the performance evaluation index of the target chip is regarded as a reward signal. Furthermore, a simulation environment is constructed to simulate the chip design and evaluation process. The environment can generate a target chip according to specific chip parameters and calculate indexes such as performance, power consumption, area and the like. Dynamic evaluation optimization algorithms, such as deep dynamic evaluation optimization networks, can then be employed to construct a model for learning and generating excellent chip design strategies. The model receives as input the state of the environment (i.e., the current chip design parameters) and outputs the next action (i.e., the new design parameter settings). Finally, iterative training can be performed using a dynamic evaluation optimization algorithm to optimize the policy model through interactions with the environment. In each iteration, an initial individual is selected as the current chip design parameter and a policy model is used to generate a new chip parameter setting. And then, applying the generated chip parameters to a simulation environment, calculating the performance index of the chip, and obtaining a reward signal. Parameters of the strategy model may be updated based on the reward signal to enable the model to better generate an excellent design strategy.

Through repeated iterative training, the performance of the strategy model can be gradually improved, and an optimal initial individual, namely, the chip design parameter setting with the optimal performance under given targets and constraint conditions, can be found.

For example, assuming that the goal is to design an embedded chip for an internet of things device, the processor type, memory size, storage capacity, etc. settings of the chip can be optimized by dynamically evaluating an optimization model. The dynamic evaluation optimization model may generate a target chip based on the chip parameters and optimize the selection of the chip parameters based on the chip performance evaluation index (e.g., power consumption, processing speed, etc.) as a reward signal. Through multiple iterative training, we can find the initial individual that is optimal for a given goal, i.e., the chip design parameter setting with the best performance and fitness. In this way, the best initial individual in the initial chip population that meets the requirements and has the best performance can be obtained.
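The interaction loop described above (choose design parameters, simulate, receive a reward, update the strategy) can be illustrated with a very small stand-in. The sketch below is not the dynamic evaluation optimization model of the application: it substitutes a simple epsilon-greedy value-estimate update, and `SPACE`, `simulate` and `train_policy`, as well as every number in the reward formula, are hypothetical assumptions.

```python
import random

# Hypothetical design space and reward; neither is specified by the source.
SPACE = {"processor": ["ARM Cortex-M4", "RISC-V"], "memory_kb": [4, 8, 16]}


def simulate(design):
    """Stand-in for the simulation environment: returns a scalar reward."""
    performance = design["memory_kb"] / 16                    # assumed: more memory, faster
    power = 0.2 if design["processor"] == "RISC-V" else 0.3   # assumed power cost
    power += design["memory_kb"] / 64
    return performance - power


def train_policy(space, episodes=500, epsilon=0.1, lr=0.1, seed=0):
    """Learn a value estimate for each (parameter, value) choice from rewards."""
    rng = random.Random(seed)
    q = {(p, v): 0.0 for p, vals in space.items() for v in vals}
    for _ in range(episodes):
        design = {}
        for p, vals in space.items():
            if rng.random() < epsilon:                 # explore a random value
                design[p] = rng.choice(vals)
            else:                                      # exploit current estimates
                design[p] = max(vals, key=lambda v: q[(p, v)])
        reward = simulate(design)
        for p, v in design.items():                    # move estimates toward the reward
            q[(p, v)] += lr * (reward - q[(p, v)])
    # The learned strategy picks the highest-valued choice for each parameter.
    return {p: max(vals, key=lambda v: q[(p, v)]) for p, vals in space.items()}


best = train_policy(SPACE)
print(best)
```

A deep network, as mentioned in the text, would replace the lookup table `q` when the design space is too large to enumerate.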

103, Performing crossover and mutation operations on the optimal initial individuals to obtain a child chip population of the target chip.

Wherein each child individual in the population of child chips is configured to represent an iteration set state of the target chip, each iteration set state corresponding to a set of iteration chip parameters of the target chip.

Here, an iteration setting state refers to the parameters, configuration or settings in use during one iteration of chip design or optimization. For example, when optimizing a chip design, each iteration uses a particular set of chip parameter settings to generate and evaluate the chip design, and that set of settings constitutes the setting state for the iteration. As the parameter settings are adjusted from one iteration to the next, the chip design is gradually optimized and each iteration yields a different setting state. An iteration setting state therefore corresponds to the system state and design scheme of one iteration stage.

Iterative chip parameters refer to chip design parameters that are used to adjust and optimize during iterations of chip design or optimization. In each iteration, the chip parameters are continuously modified and adjusted to obtain better performance, power consumption, area and other indexes, so as to gradually approach or reach the design target. For example, assuming that the design of an embedded chip is being optimized, the chip parameters may include processor type, frequency, cache size, memory capacity, hardware accelerator usage, and the like. In each iteration, we adjust the values or configurations of these chip parameters, generate a new chip design, and evaluate it for performance. And according to the evaluation result, feedback can be obtained, the chip parameters can be adjusted, and the next iteration is performed again.

By continuously adjusting the iteration chip parameters, the chip design can be gradually optimized in the continuous iteration process, the performance of the chip design is improved, the power consumption is reduced, the area is reduced and the like, and finally the target chip meeting the design requirements is obtained. Therefore, the iterative chip parameters are chip design parameters that need to be adjusted and optimized in each iteration during the chip design optimization process, for achieving progressive improvement and optimization of the chip design.

In this embodiment, crossover and mutation operations are performed on the optimal initial individuals to obtain a child chip population of the target chip. Each child individual in the child chip population represents one iteration setting state of the target chip, that is, one set of iteration chip parameters. First, in the crossover operation, two individuals are randomly selected from the optimal initial individuals, and two new child individuals are generated by exchanging some of their chip parameters. The crossover operation may follow different strategies, such as single-point crossover, multi-point crossover or uniform crossover. For example, in single-point crossover one crossover point is randomly selected, and the chip parameters of the two individuals beyond that point are swapped to generate two new child individuals.
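The single-point variant of the crossover operation can be sketched as follows. This is an illustrative sketch: `single_point_crossover` is a hypothetical helper, the parameter names are assumptions, and it assumes both parents share the same parameters in the same order and have at least two of them.

```python
import random


def single_point_crossover(parent_a, parent_b, rng=random):
    """Swap the parameter tails of two parents at one random crossover point."""
    keys = list(parent_a)                  # parameter order shared by both parents
    point = rng.randrange(1, len(keys))    # 1..len-1, so each child mixes both parents
    child_a = {k: (parent_a if i < point else parent_b)[k] for i, k in enumerate(keys)}
    child_b = {k: (parent_b if i < point else parent_a)[k] for i, k in enumerate(keys)}
    return child_a, child_b


rng = random.Random(0)
a = {"processor": "ARM Cortex-M4", "memory_kb": 4, "flash_mb": 128}
b = {"processor": "RISC-V", "memory_kb": 16, "flash_mb": 512}
child_a, child_b = single_point_crossover(a, b, rng)
print(child_a, child_b)
```

Because the crossover point is never 0 or the full length, the first parameter always comes from one parent and the last from the other, so neither child is a plain copy of a parent.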

Next, the mutation operation introduces random changes into the child individuals. It generates new child individuals by changing certain chip parameters of an individual. For example, one or more chip parameters may be randomly selected and subjected to small random variations, such as adding or subtracting a value within a small range.
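A small-magnitude mutation of numeric chip parameters might look like the sketch below. The function `mutate`, the parameter names, the bounds, the mutation rate and the step size are all assumptions for illustration; clamping to the legal range is one possible way to keep mutated individuals valid.

```python
import random


def mutate(individual, bounds, rate=0.2, step=0.05, rng=random):
    """Perturb each bounded parameter with probability `rate` by at most ±step of its range."""
    child = dict(individual)               # leave the parent untouched
    for name, (low, high) in bounds.items():
        if rng.random() < rate:
            delta = rng.uniform(-step, step) * (high - low)
            child[name] = min(high, max(low, child[name] + delta))  # clamp to the legal range
    return child


# Hypothetical numeric parameters and ranges.
bounds = {"frequency_mhz": (100.0, 400.0), "cache_kb": (8.0, 64.0)}
parent = {"processor": "RISC-V", "frequency_mhz": 200.0, "cache_kb": 32.0}
child = mutate(parent, bounds, rate=1.0, rng=random.Random(0))
print(child)
```

Categorical parameters such as the processor type are left unchanged here; they could instead be mutated by drawing a different value from their allowed set.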

Through the crossover and mutation operations, new variation and diversity are introduced on the basis of the optimal initial individuals, thereby generating a child chip population. Each child individual in the child chip population represents the setting state of the target chip after one iteration, that is, one set of iteration chip parameters. These parameters may be inherited unchanged from the parents or altered by the crossover and mutation operations. In this way, the chip parameter settings are optimized and improved with each generation and gradually approach, and may exceed, the design requirements of the target chip.

It should be noted that, in practical applications, the crossover and mutation operations must respect the requirements and constraints of the specific problem, so that the generated child chip population still meets the requirements of the target chip design.

104, Performing an iterative loop on the child chip population until a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached, and setting the chip parameters of the target chip based on the optimal child individual in the last-generation chip population.

In the embodiment of the application, the preset iteration stop condition is a termination condition set in the optimization algorithm, and once the condition is met, the optimization algorithm stops iteration. The condition may be that the number of iterations reaches a preset maximum value, or that the variation of the chip performance index is smaller than a certain threshold, or that an optimization target preset by a user is reached, or the like. In practical applications, the selection of this condition is set by the user according to the specific situation.
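The three example stop conditions can be combined in a single check, sketched below. The function `should_stop` and its default thresholds are hypothetical; the text leaves the actual values to the user.

```python
def should_stop(generation, history, max_generations=50, tolerance=1e-3, target=None):
    """Return True when any of the three example stop conditions holds.

    history: best fitness value recorded for each generation so far.
    """
    if generation >= max_generations:                      # iteration budget exhausted
        return True
    if target is not None and history and history[-1] >= target:
        return True                                        # user-defined target reached
    if len(history) >= 2 and abs(history[-1] - history[-2]) < tolerance:
        return True                                        # performance index has stagnated
    return False


print(should_stop(3, [0.10, 0.50]))               # still improving
print(should_stop(3, [0.50, 0.5001]))             # change below tolerance
print(should_stop(3, [0.10, 0.90], target=0.8))   # target reached
```

Comparing only the last two generations is the simplest stagnation test; a longer window would be less sensitive to a single flat generation.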

Chip optimization targets are the targets to be achieved in the iterative process of the optimization algorithm, and include, but are not limited to, performance improvement, power consumption reduction, area reduction, stability improvement, and the like. In chip design, the specific optimization objectives vary according to the application scenario and requirements of the chip. For example, for an embedded system chip, the optimization goal may be to improve performance and reduce power consumption; for a communication chip, the optimization goal may be to improve communication rate and stability. In the optimization algorithm, the chip optimization objective is generally converted into a specific performance index or objective function, which is used to measure the quality of each chip individual.

Therefore, in the optimization algorithm, the selection of the preset iteration stop condition and the setting of the chip optimization target are very important, and they directly affect the convergence of the optimization result and the final optimization effect.

In particular, for an arithmetic unit chip, possible chip optimization objectives include, but are not limited to, the following:

First, increasing the operation speed: the arithmetic unit chip is mainly used for performing various computing operations, and in practical applications the operation speed is generally one of the most important indicators. Therefore, the optimization target may be to increase the operation speed, for example by adopting a more efficient algorithm or by increasing parallelism.

Second, reducing power consumption: with the widespread use of mobile devices and embedded systems, reducing power consumption has become a very important optimization objective for an arithmetic unit chip. Adopting low-power design techniques to reduce the static power consumption, dynamic power consumption, and the like of the chip improves its energy-saving performance and prolongs the battery life of the device.

Third, improving stability: in some high-precision computing scenarios, the stability of the arithmetic unit chip is an important optimization objective. To this end, various measures may be used to improve the accuracy and stability of the chip, such as adding error correction codes, using redundant designs, and improving power supply stability.

Fourth, reducing area: for miniaturized arithmetic unit chips, area is also often an important optimization objective. Reducing the chip area by compressing the logic structure, reducing the number of flip-flops, optimizing the internal wiring, and the like lowers the manufacturing cost and raises the integration level.

In summary, when the arithmetic unit chip is optimized, a suitable optimization target can be selected and set according to specific requirements, so as to improve indexes in aspects of performance, stability, energy-saving performance, cost and the like of the chip.

The chip parameters at least comprise a chip structure, logic units, and a circuit wiring structure. Chip parameters are important factors in determining chip design and performance, and optimizing them can effectively improve chip performance and functionality. The chip structure refers to the basic physical structure and hierarchy of the chip, including the overall chip size, the hierarchy, the layout of functional units, and so on. For example, the structural parameters of an embedded chip may include the total area of the chip, its hierarchical structure, and the arrangement of functional modules. The logic units are the basic constituent units of the chip that realize its logic functions, including logic gates, timing units, memory units, and the like. When designing a chip, the kind and number of logic units need to be selected reasonably and adjusted appropriately. For example, the logic parameters of a digital signal processor (DSP) chip may include the number and type of arithmetic units, the way accelerators are used, and so on. The circuit wiring structure refers to the interconnection structure and wiring arrangement between the individual logic devices within the chip. Its parameters include the wiring density, line width, line spacing, number of metal layers, and so on, and directly affect the speed and power consumption of signal transmission. For example, the circuit wiring parameters of a high-speed communication chip may include the transmission distance of each channel and the layout and routing of the circuit board.

As an optional embodiment, in 104, performing the iterative loop on the child chip population, as shown in fig. 2, may be implemented as:

201, acquiring a fitness parameter of each child individual in the child chip population;

202, continuing to execute cross mutation operation on the optimal offspring individual with the highest fitness parameter so as to obtain the next generation chip population of the target chip;

203, circularly executing the step of obtaining the fitness parameter of each child individual in the next generation chip population, and continuously executing the cross mutation operation on the optimal child individual with the highest fitness parameter to obtain the next generation chip population until the child individual meeting the chip optimization target is selected, or when a preset iteration stop condition is reached, stopping the iterative computation.
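Steps 201 to 203 can be sketched as a short loop. This is an illustrative reading only: `fitness` and `breed` are caller-supplied placeholders, the population size is arbitrary, and keeping the best individual seen so far (elitism) is an assumption added here so that the result never gets worse, not something the application states.

```python
import random

def evolve(seed, fitness, breed, target, max_generations=50, pop_size=8, rng=None):
    """Iterate: score the population, breed again from the fittest individual."""
    rng = rng or random.Random()
    best = seed
    generations_run = 0
    for _ in range(max_generations):                 # preset iteration stop condition
        generations_run += 1
        population = [breed(best, rng) for _ in range(pop_size)]   # step 202
        candidate = max(population, key=fitness)     # step 201: highest fitness
        if fitness(candidate) > fitness(best):       # elitism (assumption)
            best = candidate
        if fitness(best) >= target:                  # chip optimization target met
            break
    return best, generations_run

# Toy stand-ins: one scalar "chip parameter" whose ideal value is 3.0.
def toy_fitness(x):
    return -(x - 3.0) ** 2

def toy_breed(x, rng):
    return x + rng.uniform(-0.5, 0.5)

best, generations = evolve(0.0, toy_fitness, toy_breed, target=-0.01,
                           max_generations=200, rng=random.Random(1))
```

In a real flow, `breed` would apply the crossover and mutation operations to chip parameter sets and `fitness` would come from the evaluation indexes discussed below.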

In step 201, the fitness parameter of each child individual in the child chip population is obtained, which helps to evaluate the performance of each child individual in the chip design process. In the embodiment of the application, the fitness parameter may be an evaluation index covering chip performance, power consumption, area, and the like, and is used to quantify the quality of a chip design. That is, the fitness parameter measures the fitness or dominance of an individual in an evolutionary algorithm or optimization problem. In the optimization process, each individual is assigned a fitness parameter value reflecting its performance in the solution space. The fitness parameter is typically defined based on the specific goals or constraints of the problem, and may be a single index or a combination of several indexes.

In practical application, the fitness parameter is used for evaluating the quality degree of an individual in the current chip design process. By calculating the fitness parameters of the individuals, the performance of the individuals can be quantitatively compared, and the individuals can be judged to be better or more suitable for the target requirements in the solution space of the problem. In evolutionary algorithms, fitness parameters are used as a basis for selection operations to determine which individuals can replicate or pass on to the next generation. In general, individuals with higher fitness will have a higher chance to be selected as parents, thus combining their genotype with other individuals results in excellent offspring. The fitness parameter is a driving factor in the optimization process, so that the optimization algorithm is advanced towards a better solution. By performing operations such as mutation, crossover and the like on individuals with low fitness, the optimization algorithm searches for a better solution by generating new individuals, thereby continuously improving the value of the fitness parameter in the optimization process.

Continuous optimization of the current optimal solution can be achieved by continuing the cross mutation operation on the optimal offspring individuals with the highest fitness parameters in steps 202 and 203. Thus, even if the current optimal solution has been found, the dynamic process of improving and optimizing it is maintained in the next generation chip population.

Wherein, the evaluation indexes of the fitness parameter include at least one of the following: the clock frequency, logic delay, static power consumption, dynamic power consumption, wafer area, on-chip memory area, and error rate of the target chip. Clock frequency refers to the number of clock cycles the chip completes per unit time and directly affects the operation speed of the chip. In some high-performance application scenarios, raising the clock frequency can significantly improve the operating efficiency and speed of the chip. Logic delay refers to the delay incurred by the chip during logic calculation or signal transmission. Lower logic delay generally increases the computation speed and response speed of the chip. Static power consumption is the fixed power the chip consumes while working and does not change with the working state. Lower static power consumption helps reduce the total power consumption of the chip and improves energy-saving performance, which is particularly important in resource-limited application scenarios such as mobile devices and wireless sensors. Dynamic power consumption is the power the chip consumes when performing logic switching and signal transmission. Lower dynamic power consumption reduces the heat generation and power consumption of the chip and extends the battery endurance of the device. The wafer area is the physical space occupied by the chip and is directly related to the manufacturing cost and integration level of the chip. A smaller wafer area reduces manufacturing cost and increases chip integration. On-chip memory area refers to the area occupied by the memory that stores data and instructions in the chip. A smaller on-chip memory area can reduce the area of the chip and increase the access efficiency and capacity of the memory. The error rate refers to the probability that the chip produces an error during operation. A lower error rate improves the reliability and stability of the chip, which is particularly important in application scenarios with high requirements on the accuracy of calculation results.

It should be noted that in the chip optimization, a suitable fitness parameter can be selected according to specific application requirements and optimization targets, so as to evaluate the advantages and disadvantages of the individual chip, and the suitable individual chip can be selected by using optimization methods such as genetic algorithm, optimization algorithm and the like so as to achieve the optimization targets.

For example, assume that, in a child chip population, individual A is evaluated as having the best fitness parameter, that is, the best performance in the current design space. By continuing the cross mutation operation on individual A in steps 202 and 203, individual A may be further improved and optimized so that it keeps a leading position in the next-generation chip population, possibly with a higher fitness parameter. In this way, the iterative loop continuously optimizes the child population, and the optimal child individual meeting the chip optimization target can finally be obtained under the given stop condition.

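Further optionally, the fitness parameter $F\left(x_i^{(t)}\right)$ of a child individual $x_i^{(t)}$ in the child chip population $P^{(t)}$ is calculated using the following formula, namely:

$$F\left(x_i^{(t)}\right)=\sum_{j=1}^{n} w_j\, f_j\left(x_i^{(t)}\right),\qquad f_j\left(x_i^{(t)}\right)=\frac{s_j\left(x_i^{(t)}\right)-s_j^{\min}}{s_j^{\max}-s_j^{\min}};$$

Wherein, $t$ is the iteration number of the child chip population $P^{(t)}$, $n$ is the total number of evaluation indexes, $f_j\left(x_i^{(t)}\right)$ represents the fitness parameter score of the child individual $x_i^{(t)}$ on the $j$-th evaluation index, $s_j\left(x_i^{(t)}\right)$ represents the actual score of the child individual $x_i^{(t)}$ on the $j$-th evaluation index, $s_j^{\min}$ represents the minimum reference value of the $j$-th evaluation index, $s_j^{\max}$ represents the maximum reference value of the $j$-th evaluation index, and $w_j$ represents the weight coefficient of the $j$-th evaluation index.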
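One plausible reading of the definitions above is a weighted sum of min-max normalized scores, and the sketch below computes exactly that. This reading is an assumption, as are the function name and the sample numbers. For indexes where lower is better (power consumption, area, error rate), the score is assumed to be inverted before it is passed in.

```python
def fitness_parameter(actual_scores, ref_min, ref_max, weights):
    """Weighted sum of min-max normalized scores over all evaluation indexes."""
    total = 0.0
    for s, lo, hi, w in zip(actual_scores, ref_min, ref_max, weights):
        normalized = (s - lo) / (hi - lo)   # per-index score in [0, 1] when lo <= s <= hi
        total += w * normalized
    return total

# Two evaluation indexes, each exactly halfway between its reference values.
value = fitness_parameter(
    actual_scores=[2.0, 0.5],
    ref_min=[1.0, 0.0],
    ref_max=[3.0, 1.0],
    weights=[0.6, 0.4],
)
```

With weights that sum to 1, the result stays in the range 0 to 1 whenever every actual score lies between its reference values, which makes individuals from different generations directly comparable.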

In the embodiment of the application, parameter optimization of the target chip is achieved through the learning and optimization that the dynamic evaluation optimization model performs on the chip population, together with the crossover, mutation, and iteration of the chip population. This reduces the workload and time consumption of manual design in the related art, effectively avoids becoming trapped in a local optimal solution, and helps to obtain the global optimal solution, thereby further improving chip performance and chip optimization efficiency while ensuring a balance of low energy consumption and high performance.

In the above or the following embodiments, it is assumed that the dynamic evaluation optimization model includes at least: the system comprises a pre-configuration layer, a simulation parameter layer, an evaluation strategy network layer, a value function network layer and an output layer.

Illustratively, each of these sections will be described below. In the dynamic evaluation optimization model, a pre-configuration layer, a simulation parameter layer, an evaluation strategy network layer, a value function network layer and an output layer play different roles, and the specific description is as follows:

The pre-configuration layer is used for mapping the initialization setting state of each initial individual in the initial chip population into the simulation environment state space so as to obtain the simulation environment state parameters of each initial individual. The pre-configuration layer helps map the parameter configuration of the initial chip individual to the simulation environment for subsequent simulation and evaluation.

The simulation parameter layer obtains the simulation chip running condition of each initial individual by running the simulation environment state parameters in the simulation environment state space. This layer uses the simulation environment to simulate the operation of the chip, which helps to evaluate the performance of each individual in the virtual environment.

The evaluation strategy network layer uses a preset reward feedback module to cyclically calculate the cumulative rewards return value of each initial individual based on the simulation chip running condition. The cumulative rewards return value comprises a chip performance evaluation value, a static power consumption evaluation value, a dynamic power consumption evaluation value, an area evaluation value, and the like, and is used for evaluating the advantages and disadvantages of each individual.

The value function network layer performs policy updating on the reward feedback module using the simulated state cost function based on the cumulative reward value for each initial individual. This layer evaluates and updates each individual with a cost function to optimize the selection of the optimal strategy.

The output layer selects, according to the cumulative rewards return value obtained in the last cycle, the initial individual with the highest cumulative rewards return value from the initial chip population as the optimal initial individual. The output layer is used for outputting the optimization result, that is, the initial individual that performs best in the simulation environment is selected as the optimal solution.

Through the dynamic evaluation optimization model of the hierarchical structure, the initial chip population can be optimized through operation and evaluation in a simulation environment, so that an optimal initial individual is obtained, and a preset optimization target and an evaluation index are met.

Based on the structure assumed above, in step 102, performing policy optimization on the initial chip population by adopting the dynamic evaluation optimization model to obtain the optimal initial individual in the initial chip population, as shown in fig. 3, includes the following steps:

301, mapping an initialization setting state of each initial individual in the initial chip population into a simulation environment state space through a pre-configuration layer to obtain a simulation environment state parameter of each initial individual;

302, running the simulation environment state parameters in a simulation environment state space through a simulation parameter layer to obtain the simulation chip running condition of each initial individual;

303, circularly calculating a cumulative rewards return value of each initial individual by using a preset rewards feedback module based on the running condition of the simulation chip through an evaluation strategy network layer until the set circulation times are reached or a preset convergence condition is reached;

304, through a value function network layer, based on the accumulated rewards return value of each initial individual, performing strategy updating on the rewards feedback module by adopting a simulation state cost function;

305, selecting an initial individual with the highest cumulative rewards return value from the initial chip population as the optimal initial individual according to the cumulative rewards return value obtained in the last cycle through an output layer.

In an embodiment of the present application, the cumulative rewards return value is an index for evaluating each individual in the dynamic evaluation optimization model. It covers several aspects of the chip and at least includes: a chip performance evaluation value, a static power consumption evaluation value, a dynamic power consumption evaluation value, and an area evaluation value.

The chip performance evaluation value refers to the performance the chip can achieve when executing tasks, such as its running speed and the amount of data it can process. Performance is one of the important indicators in chip design. The static power consumption evaluation value refers to the power consumption of the chip in the idle stage. Static power consumption directly affects the total power consumption of the chip, so it is one of the indexes often considered when optimizing a chip. The dynamic power consumption evaluation value refers to the power consumed by the chip while running tasks. It reflects the relationship between the performance and the power consumption of the chip and is generally one of the important indexes in chip design. The area evaluation value refers to the size of the physical space occupied by the chip; the area directly relates to the manufacturing cost and the integration level of the chip.

Evaluating each individual on these four indexes helps to measure the advantages and disadvantages of different designs and to select the optimal design scheme. Other factors, such as the reliability, stability, and fault tolerance of the chip design, can also be considered in the evaluation process so as to evaluate each individual more fully.

Through steps 301 to 305, each individual in the initial chip population can be comprehensively evaluated and optimized, so that the individual with the best performance is finally selected as the optimal initial individual. This provides a better design scheme and result for chip design, realizes more effective power consumption prediction and optimization, and improves the energy efficiency and the user experience of the system.

As an optional embodiment, in 301, the mapping, by the pre-configuration layer, the initialization setting status of each initial individual in the initial chip population into the simulated environment status space to obtain a simulated environment status parameter of each initial individual, as shown in fig. 4, further includes the following steps:

401, performing state coding on the initialization setting state of each initial individual to obtain a corresponding initialization setting state vector of each initial individual in a simulation environment state space;

402, mapping each initialization setting state vector to a simulation environment state space to perform parameter simulation processing, so as to obtain a simulation environment state parameter of each initial individual.

In steps 401 and 402, first, the initialization status of each initial individual is status-coded to obtain a corresponding initialization status vector of each individual in the simulated environment status space. The effect of this step is to convert each individual initialization setting state into a specific vector representation for convenient processing and analysis in the simulated environment state space. Further, each initialization setting state vector is mapped to a simulation environment state space to perform parameter simulation processing, so as to obtain a simulation environment state parameter of each initial individual. The effect of this step is to convert the encoded initialization setting state vector into parameter values in the simulation environment, thereby accurately simulating the state of each individual in the simulation environment.

In the embodiment of the application, the acquisition process of the initialization setting state vector $v_i$ of the initial individual $x_i$ is expressed as the following formula:

$$v_i = f\left(s_i\right);$$

Wherein, $x_i$ represents the $i$-th initial individual, $s_i$ represents the state vector obtained after coding the $i$-th initial individual, $e_i$ represents the simulated environment state vector corresponding to the $i$-th initial individual, $f(\cdot)$ represents the computational process function that solves for the corresponding initialization setting state vector $v_i$ based on the state vector $s_i$, $p_i$ represents the performance index of the $i$-th initial individual, $ps_i$ represents the static power consumption of the $i$-th initial individual, $pd_i$ represents the dynamic power consumption of the $i$-th initial individual, and $a_i$ represents the area of the $i$-th initial individual.

Through the combination of the two steps, the initialization setting state of each individual in the initial chip population can be converted into the parameter representation in the simulation environment state space, so that accurate input is provided for subsequent simulation and evaluation. The process converts the abstract initialization setting state of the initial individual into the specific operational simulation environment state parameters, provides basis and convenience for subsequent evaluation and strategy optimization, and is helpful for more accurately evaluating the performance of each individual and promoting the performance of strategy optimization.
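Steps 401 and 402 can be illustrated with a small encoding sketch. The field names, the fixed component order, and the per-component scaling used as the "parameter simulation processing" are assumptions chosen for the example; the application does not specify the encoding or the mapping function.

```python
from dataclasses import dataclass

@dataclass
class InitialIndividual:
    """Hypothetical initialization setting state of one initial individual."""
    performance: float        # performance index
    static_power_w: float     # static power consumption
    dynamic_power_w: float    # dynamic power consumption
    area_mm2: float           # area

def encode_state(individual):
    """Step 401: encode the setting state as a fixed-order state vector."""
    return [individual.performance, individual.static_power_w,
            individual.dynamic_power_w, individual.area_mm2]

def to_sim_params(state_vector, scales):
    """Step 402: map the vector into the simulation state space (here: scaling)."""
    return [v / s for v, s in zip(state_vector, scales)]

vec = encode_state(InitialIndividual(100.0, 0.2, 1.3, 25.0))
params = to_sim_params(vec, scales=[200.0, 1.0, 2.0, 50.0])
```

A fixed component order matters here because every later layer reads the vector by position.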

Further optionally, the simulation parameter layer includes at least: simulator, monitoring simulator. The simulator is a computer program for simulating the running state of a chip, and has the main functions of converting the simulation environment state parameters into input parameters required by the running of the simulator, and running the simulator to simulate the running of the chip on the input parameters to obtain the running condition of the chip of each initial individual, wherein the conditions comprise indexes of running speed, processing capacity, performance and the like, and the indexes are used for calculating the cumulative rewards return value.

The monitoring simulator is a computer program for monitoring the running state of the chip and extracting key index values, and has the main functions of monitoring the running state of the chip of each initial individual output by the simulator and extracting key performance index data. The monitoring simulator can realize monitoring and extraction of various indexes, such as clock frequency, logic delay, static power consumption, dynamic power consumption and the like, according to different types of indexes required to be extracted.

By introducing the simulation parameter layer, the chip running condition of each initial individual can be simulated and monitored in the virtual environment, and relevant chip performance and key index data are output, so that the performance and anti-interference capability of each individual can be evaluated, and important data support and feedback are provided for a subsequent optimization scheme. In practical applications, the simulation parameter layer can also be expanded into other specific simulation and monitoring modules to meet specific design and evaluation requirements.

In the step 302, the simulation environment state parameters are run in the simulation environment state space through the simulation parameter layer to obtain the simulation chip running condition of each initial individual, as shown in fig. 5, and further includes the following steps:

501, converting the simulation environment state parameters of each initial individual into input parameters required by the simulator in operation;

502, performing chip operation simulation on the input parameters by an operation simulator to obtain a chip operation state of each initial individual;

503, monitoring the chip operation state of each initial individual by using a monitoring simulator so as to obtain chip operation monitoring data of each initial individual.
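Steps 501 to 503 form a convert, simulate, monitor pipeline, which the following sketch mimics with toy functions. Nothing here models a real chip simulator: the parameter names are hypothetical, and the simulator stand-in only evaluates the familiar dynamic power relation P = a * C * V^2 * f with made-up default values for the activity factor and the switched capacitance.

```python
def to_simulator_inputs(state_params):
    """Step 501: convert simulation environment state parameters to simulator inputs."""
    return {
        "voltage_v": state_params["voltage"],
        "clock_hz": state_params["clock_mhz"] * 1e6,
    }

def run_simulator(inputs, switched_capacitance_f=1e-9, activity=0.2):
    """Step 502: toy stand-in for chip operation simulation."""
    dynamic_power_w = (activity * switched_capacitance_f
                       * inputs["voltage_v"] ** 2 * inputs["clock_hz"])
    return {"clock_hz": inputs["clock_hz"], "dynamic_power_w": dynamic_power_w}

def monitor(run_state, wanted=("clock_hz", "dynamic_power_w")):
    """Step 503: extract the requested indexes from the chip operation state."""
    return {name: run_state[name] for name in wanted}

data = monitor(run_simulator(to_simulator_inputs({"voltage": 1.0, "clock_mhz": 500})))
```

Keeping the three stages as separate functions mirrors the separation between the simulator and the monitoring simulator described above, so either one could be swapped for a more detailed module.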

Wherein, the chip operation monitoring data at least comprises: clock frequency, logic delay, static power consumption, dynamic power consumption, wafer area, on-chip memory area, and error rate. These indexes are as described above, and a detailed description is omitted here.

For example, assume that there is an initial chip unit with some initialization state parameters such as voltage, clock frequency, router settings, etc. These parameters can be converted into input parameters required for simulator operation. In step 501, the simulated environmental state parameters of each initial individual are converted into input parameters required for the simulator to operate. These parameters may include digital-to-analog converter (DAC) settings, transport protocol settings, hardware accelerator settings, etc. These parameters will be translated into an input format that the simulator can understand in order to perform the chip run simulation. Further, in step 502, the operation simulator performs chip operation simulation on the input parameters. In the simulator, input parameters are to be used for simulating the operation state of the chip, such as transmission characteristics of analog circuits, logic operations, signal processing, etc. Through the simulator, the chip running state data corresponding to each initial individual, such as power consumption, time sequence, signal strength and the like, can be obtained. Next, in step 503, a monitoring simulator is used to monitor the chip operation status of each initial individual. The monitoring simulator can monitor the state of the chip in real time in the running process and record monitoring data such as clock frequency, logic delay, static power consumption, dynamic power consumption and the like. These data will provide detailed information about the initial individual chip operation.

By integrating the steps, the complete process from the parameter setting of the initial individual to the simulator operation and then to the monitoring of the simulator monitoring data can be realized, so that the chip operation state and performance data of each initial individual can be obtained. These data are important for subsequent evaluation and optimization, and can help determine the best initial individual and optimize the design.

In step 303, cyclically calculating, through the evaluation strategy network layer and based on the simulation chip running condition, the cumulative rewards return value of each initial individual by using the preset reward feedback module until the set number of cycles is reached or a preset convergence condition is reached, as shown in fig. 6, further includes the following steps:

601, calculating an instant rewards return value of each initial individual by the rewards feedback module based on chip operation monitoring data of each initial individual.

The instant rewards return value is based on the evaluation result of the chip running state data. By analyzing key data of the chip, such as performance indexes and power consumption indexes, an instant rewards return value can be calculated according to a preset reward rule. Further optionally, the instant rewards return value includes at least: a performance index reward and a power consumption index reward.

Taking the design of an image processor as an example, the following describes how to calculate the instant rewards return value based on chip operation monitoring data. Assume an initial individual whose design parameters include clock frequency (Fclk), algorithm complexity (C), and resource usage (A), and assume that the chip operation monitoring data of the chip under these design parameters has been obtained in steps 501, 502, and 503. The instant rewards return value of the individual may then be calculated by the preset reward feedback module as follows:

Performance index rewards: assuming that the performance index of the image processor (such as image sharpness and color rendition) should be as high as possible, a performance index reward rule may be set, for example: if the processing speed (SP) is higher than a preset threshold (SPt), a performance index reward of 0.8 is given; if the algorithm complexity (C) is higher than a preset threshold (Ct), a performance index reward of 0.5 is given.

Assuming that the processing speed of the initial individual is 100 fps, the algorithm complexity is 2, and SPt and Ct are 90 fps and 3, respectively, only the processing speed rule is satisfied, so the performance index reward value of the individual is 0.8 points.

Power consumption index rewards: assuming that the power consumption of the image processor should be as low as possible, a power consumption index reward rule may be set, for example: if the power consumption (P) is below a preset threshold (Pt), a power consumption index reward of 0.5 is given.

Assuming that the chip power consumption of the initial individual is 1.5 W and Pt is 2 W, the power consumption index reward value of the individual is 0.5 points. Combining the above calculation rules, the instant rewards return value of the individual is 0.8 + 0.5 = 1.3 points. Through the above calculation process, the instant rewards return value of each initial individual can be obtained and used for the subsequent calculation of the cumulative rewards return value and for optimization decisions.
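The worked example above can be checked with a few lines of code. The thresholds and reward amounts are the ones given in the example; treating the two performance rules as independent and additive is an assumption (the example satisfies only one of them, so it does not settle the point), and the function name is hypothetical.

```python
def instant_reward(speed_fps, complexity, power_w, sp_t=90.0, c_t=3.0, p_t=2.0):
    """Instant rewards return value = performance index reward + power index reward."""
    performance_reward = 0.0
    if speed_fps > sp_t:       # processing speed above SPt
        performance_reward += 0.8
    if complexity > c_t:       # algorithm complexity above Ct
        performance_reward += 0.5
    power_reward = 0.5 if power_w < p_t else 0.0   # power consumption below Pt
    return performance_reward + power_reward

# Values from the example: 100 fps, complexity 2, 1.5 W.
reward = instant_reward(speed_fps=100.0, complexity=2.0, power_w=1.5)
```

For the example individual this yields the performance reward of 0.8 plus the power reward of 0.5, matching the 1.3 points computed in the text.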

602, performing cumulative calculation on the instant rewards return value of each initial individual and the historical cumulative rewards return value obtained in the previous iteration period to obtain the cumulative rewards return value of each initial individual.

In step 602, accumulating the instant rewards return values in this way carries the influence of earlier rewards forward, so that the overall performance of an individual is considered rather than a single evaluation. This provides more accurate reference data for subsequent optimization and decision making.

Further optionally, the cumulative rewards return value is the sum of the instant rewards return value and the product of a discount factor and the historical cumulative rewards return value. Introducing the discount factor lets the calculation weigh the importance of an instant rewards return value against the time at which it was obtained, so that the cumulative rewards return value of each initial individual is evaluated more reasonably.
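The accumulation rule just described is a one-line recurrence. In the sketch below, the discount factor of 0.9 and the sequence of instant rewards are illustrative values only.

```python
def cumulative_reward(history_total, instant, discount=0.9):
    """New cumulative value = discount * historical cumulative value + instant value."""
    return discount * history_total + instant

total = 0.0
for instant in [1.3, 0.8, 1.3]:   # instant rewards return values of three cycles
    total = cumulative_reward(total, instant)
```

After the three cycles the running total is 1.3, then 0.9 * 1.3 + 0.8 = 1.97, then 0.9 * 1.97 + 1.3 = 3.073, so the contribution of older rewards shrinks by the discount factor at each cycle.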

Through the above steps, the cumulative reward return value of each initial individual can be calculated cyclically to assess the individual's performance and to guide the next round of optimization and adjustment. Through continuous iteration and optimization, the overall performance and efficiency of the chip can be gradually improved.

In the above embodiment, the first expression in the following set of equations computes the reward prediction error, that is, the error between the actual reward value and the predicted reward value at that moment. The second expression updates the function approximation parameters by gradient descent according to this reward prediction error, thereby updating the simulated state value function. That is, the parameter update formula of the simulated state value function is expressed as:

$$\delta = R(s) + \gamma V(s') - V(s)$$

$$\Delta\theta = \alpha \, \nabla_{\theta} V(s) \, \delta$$

Wherein $\delta$ represents the reward prediction bias, that is, the error between the actual instant reward return value and the predicted instant reward return value; $R(s)$ represents the instant reward return value obtained by taking a certain parameter under state variable $s$ of the simulated state value function; $\gamma$ represents the discount factor; $V(s)$ represents the predicted state value parameter under state variable $s$ of the simulated state value function; $V(s')$ represents the predicted state value parameter under another state variable $s'$ of the simulated state value function; $\alpha$ is the learning rate; $\theta$ represents the parameters approximated by the simulated state value function; $\nabla_{\theta} V(s)$ represents the result of taking the gradient of the predicted state value parameter with respect to the parameter $\theta$; and $\Delta\theta$ represents the function approximation parameter update, used to update the parameters of the simulated state value function based on the reward prediction bias.
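The two-step update (compute the reward prediction bias, then scale the gradient of the predicted state value by the learning rate and the bias) can be sketched as below. The sketch assumes a linear approximation V(s) = θ·s, for which the gradient with respect to θ is simply the state vector, and it assumes the update is applied as θ ← θ + Δθ; neither choice is stated in the text, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict_value(theta: Sequence[float], state: Sequence[float]) -> float:
    """Predicted state value V(s) under an assumed linear approximation."""
    return sum(t * x for t, x in zip(theta, state))


def td_update(theta: Sequence[float], state: Sequence[float],
              next_state: Sequence[float], reward: float,
              gamma: float, alpha: float) -> Tuple[List[float], float]:
    """Return the updated parameters and the reward prediction bias delta."""
    # delta = R(s) + gamma * V(s') - V(s)
    delta = reward + gamma * predict_value(theta, next_state) - predict_value(theta, state)
    # For a linear V, the gradient of V(s) with respect to theta is the state vector.
    delta_theta = [alpha * x * delta for x in state]
    # Applying the update as theta + delta_theta is an assumption of this sketch.
    new_theta = [t + d for t, d in zip(theta, delta_theta)]
    return new_theta, delta


new_theta, delta = td_update(theta=[0.0, 0.0], state=[1.0, 0.0],
                             next_state=[0.0, 1.0], reward=1.3,
                             gamma=0.9, alpha=0.1)
print(round(delta, 3), [round(t, 3) for t in new_theta])
```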

In summary, through the above steps, each individual in the initial chip population can be comprehensively evaluated and optimized, so that the best-performing individual is finally selected as the optimal initial individual. This provides a better design scheme and result for chip design, enables more effective power consumption prediction and optimization, and improves the energy efficiency of the system and the user experience.

In yet another embodiment of the present application, there is also provided a computing subsystem, as shown in FIG. 7, comprising the following units:

Further optionally, the iteration setting unit performs an iteration loop on the child chip population, specifically configured to:

Acquiring the fitness parameter of each child individual in the child chip population;

Continuing to perform the cross mutation operation on the optimal child individual with the highest fitness parameter to obtain the next generation chip population of the target chip;

Cyclically performing the steps of acquiring the fitness parameter of each child individual in the next generation chip population and continuing to perform the cross mutation operation on the optimal child individual with the highest fitness parameter to obtain a further next generation chip population, and stopping the iterative computation when a child individual meeting the chip optimization target is selected or a preset iteration stop condition is reached.
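The iterative loop described in the steps above can be sketched as a small evolutionary loop. This is a simplified illustration, not the patented procedure: individuals are assumed to be lists of floats, the mate for crossover is drawn at random from the current population, the best individual is carried over unchanged into the next generation, and the stop condition is a fixed generation limit. All function names and parameters are hypothetical.

```python
import random
from typing import Callable, List

Individual = List[float]


def crossover(a: Individual, b: Individual, rng: random.Random) -> Individual:
    """Single-point crossover (assumes at least two genes per individual)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(ind: Individual, rng: random.Random,
           rate: float = 0.2, scale: float = 0.1) -> Individual:
    """Perturb each gene with probability `rate` by Gaussian noise."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in ind]


def evolve(population: List[Individual], fitness: Callable[[Individual], float],
           target: float, max_generations: int, seed: int = 0) -> Individual:
    rng = random.Random(seed)
    best = max(population, key=fitness)
    for _ in range(max_generations):          # preset iteration stop condition
        if fitness(best) >= target:           # chip optimization target met
            break
        # Breed the next generation from the fittest individual.
        children = [mutate(crossover(best, rng.choice(population), rng), rng)
                    for _ in range(len(population) - 1)]
        population = [best] + children        # keep the best unchanged (assumption)
        best = max(population, key=fitness)
    return best


start = [[0.0, 0.0], [0.5, 0.2], [0.1, 0.9]]
score = lambda ind: -sum((g - 1.0) ** 2 for g in ind)  # toy fitness, peak at [1, 1]
winner = evolve(start, score, target=-0.01, max_generations=50, seed=1)
```

Because the best individual is retained each generation, the returned fitness can never be worse than that of the best starting individual.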

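The fitness parameter $f(x)$ of a child individual $x$ in the child chip population $P^{(t+1)}$ is calculated using the following formula:

$$f(x) = \sum_{i=1}^{n} w_i \, f_i(x), \qquad f_i(x) = \frac{F_i(x) - F_i^{\min}}{F_i^{\max} - F_i^{\min}}$$

Wherein $(t+1)$ is the iteration number of the child chip population $P^{(t+1)}$; $n$ is the total number of evaluation indexes; $f_i(x)$ represents the fitness parameter score of child individual $x$ on the $i$-th evaluation index; $F_i(x)$ represents the actual score of child individual $x$ on the $i$-th evaluation index; $F_i^{\min}$ represents the minimum reference value of the $i$-th evaluation index; $F_i^{\max}$ represents the maximum reference value of the $i$-th evaluation index; and $w_i$ represents the weight coefficient on the $i$-th evaluation index;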

Wherein the evaluation indexes of the fitness parameter include at least one of the following for the target chip: clock frequency, logic delay, static power consumption, dynamic power consumption, wafer area, on-chip memory area, and error rate.
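One reading consistent with the definitions above (an actual score, a minimum and maximum reference value, and a weight per evaluation index) is a weighted sum of min-max normalized scores. The sketch below assumes that form; the index names, reference ranges and weights are hypothetical, and how lower-is-better indexes such as power consumption are oriented is left outside the sketch.

```python
from typing import Dict, Tuple


def fitness(actual: Dict[str, float],
            reference: Dict[str, Tuple[float, float]],
            weights: Dict[str, float]) -> float:
    """Weighted sum of min-max normalized scores over the evaluation indexes."""
    total = 0.0
    for name, weight in weights.items():
        low, high = reference[name]           # minimum / maximum reference values
        normalized = (actual[name] - low) / (high - low)
        total += weight * normalized
    return total


# Hypothetical individual scored on two indexes.
value = fitness(
    actual={"clock_frequency_ghz": 2.0, "power_score": 0.8},
    reference={"clock_frequency_ghz": (1.0, 3.0), "power_score": (0.0, 1.0)},
    weights={"clock_frequency_ghz": 0.6, "power_score": 0.4},
)
print(round(value, 2))  # 0.62
```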

Further optionally, the dynamic evaluation optimization model includes at least: a pre-configuration layer, a simulation parameter layer, an evaluation strategy network layer, a value function network layer and an output layer;

The optimizing unit adopts a dynamic evaluation optimizing model to carry out strategy optimization on the initial chip population so as to obtain the optimal initial individuals in the initial chip population, and is specifically configured to:

mapping the initialization setting state of each initial individual in the initial chip population into a simulation environment state space through a pre-configuration layer to obtain simulation environment state parameters of each initial individual;

Operating the simulation environment state parameters in a simulation environment state space through a simulation parameter layer to obtain the operation condition of a simulation chip of each initial individual;

Through an evaluation strategy network layer, based on the running condition of the simulated chip, cyclically calculating the cumulative reward return value of each initial individual with a preset reward feedback module until the set number of cycles or the preset convergence condition is reached; the cumulative reward return value includes at least: a chip performance evaluation value, a static power consumption evaluation value, a dynamic power consumption evaluation value and an area evaluation value;

Through a value function network layer, updating the strategy of the reward feedback module by using the simulated state value function based on the cumulative reward return value of each initial individual;

And selecting an initial individual with the highest cumulative rewards return value from the initial chip population as the optimal initial individual according to the cumulative rewards return value obtained in the last cycle through an output layer.

Further optionally, the optimizing unit maps, through a pre-configuration layer, an initialization setting state of each initial individual in the initial chip population into a simulated environment state space to obtain a simulated environment state parameter of each initial individual, and specifically is configured to:

Carrying out state coding on the initialization setting state of each initial individual to obtain a corresponding initialization setting state vector of each initial individual in a simulation environment state space;

mapping each initialization setting state vector to a simulation environment state space respectively for parameter simulation processing so as to obtain simulation environment state parameters of each initial individual;

Wherein the process of obtaining the initialization setting state vector $s_m$ of the initial individual $m$ is expressed as the following formula:

$$s_m = f(e_m) = [\,p_m,\ E_{\text{static}}(m),\ E_{\text{dynamic}}(m),\ A(m)\,]$$
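As an illustration of the initialization setting state vector, whose four components are the performance index, static power consumption, dynamic power consumption and area of an initial individual, the sketch below packs an encoded individual into such a vector. The class name, field names and sample values are hypothetical, and the actual encoding function f is not specified in the text; here it is assumed to simply read the four quantities from the encoded individual.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EncodedIndividual:
    """Hypothetical stand-in for the encoded state e_m of initial individual m."""
    performance: float     # p_m
    static_power: float    # E_static(m)
    dynamic_power: float   # E_dynamic(m)
    area: float            # A(m)


def to_state_vector(e_m: EncodedIndividual) -> List[float]:
    """s_m = f(e_m): order the four quantities as in the formula."""
    return [e_m.performance, e_m.static_power, e_m.dynamic_power, e_m.area]


s_m = to_state_vector(EncodedIndividual(performance=0.8, static_power=0.3,
                                        dynamic_power=1.2, area=25.0))
print(s_m)  # [0.8, 0.3, 1.2, 25.0]
```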

Further optionally, the simulation parameter layer includes at least: a simulator, a monitoring simulator; the optimizing unit is used for running the simulation environment state parameters in the simulation environment state space through the simulation parameter layer so as to obtain the simulation chip running condition of each initial individual, and is specifically configured to:

converting the simulation environment state parameters of each initial individual into input parameters required by the simulator in operation;

running the simulator to perform chip operation simulation on the input parameters so as to obtain the chip running state of each initial individual;

monitoring the chip running state of each initial individual by adopting a monitoring simulator so as to obtain chip running monitoring data of each initial individual;

wherein, the chip operation monitoring data at least comprises: clock frequency, logic delay, static power consumption, dynamic power consumption, die area, on-chip memory area, and error rate.

Further optionally, when cyclically calculating, through the evaluation strategy network layer and based on the running condition of the simulated chip, the cumulative reward return value of each initial individual with a preset reward feedback module until the set number of cycles or the preset convergence condition is reached, the optimizing unit is specifically configured to:

Calculating, by the reward feedback module, the instant reward return value of each initial individual based on the chip operation monitoring data of each initial individual; wherein the instant reward return value comprises at least: a performance index reward and a power consumption index reward;

Performing cumulative calculation on the instant reward return value of each initial individual and the historical cumulative reward return value obtained in the previous iteration period to obtain the cumulative reward return value of each initial individual; the cumulative reward return value is the sum of the instant reward return value and the product of a discount factor and the historical cumulative reward return value.

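Further optionally, the parameter update formula of the simulated state value function is expressed as:

$$\delta = R(s) + \gamma V(s') - V(s)$$

$$\Delta\theta = \alpha \, \nabla_{\theta} V(s) \, \delta$$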

According to the embodiments of the application, chip optimization efficiency can be improved, the power consumption of the arithmetic unit can be reduced and its energy efficiency optimized, a balance between low energy consumption and high performance can be achieved for the arithmetic unit, and the operating efficiency of the equipment can be improved.

In yet another embodiment of the present application, there is also provided an intelligent computing platform, including: a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;

A memory for storing a computer program;

A processor for implementing the arithmetic unit chip setting method according to the above method embodiments when executing the program stored in the memory.

The communication bus 1140 of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 1140 may be divided into an address bus, a data bus, a control bus, and the like.

Illustratively, assume that a large-scale, autonomously controllable intelligent computing platform based on dedicated neural network chips needs to be built, to provide the hardware basis for developing and constructing the intelligent computing platform. The intelligent computing platform can also provide the hardware foundation for building an intelligent supercomputing center, which can serve as an artificial intelligence platform for scientific research, industry and urban services, attracting talent and fostering industry.

Specifically, the intelligent computing platform mainly comprises: an intelligent hardware platform, an intelligent computing cloud operating system, application environment development, a big data platform and an intelligent application PaaS platform. In the intelligent hardware platform, based on intelligent computing theory, deep learning chips, AI accelerator cards and distributed servers are integrated to provide basic hardware support for the whole supercomputing platform and related derivative platforms. The intelligent hardware platform mainly comprises the following four parts: the intelligent computing subsystem, the network switching subsystem, the data storage subsystem and the support management subsystem.

Further optionally, the intelligent computing subsystem is the hardware module that undertakes computation. It mainly consists of: construction of a low-energy-consumption arithmetic unit, sparse memory access DMA (Direct Memory Access), the deep learning processor cache structure, deep learning memory consistency, artificial intelligence processor card design, and dedicated servers loaded with intelligent processing cards.

The embodiment of the application provides an arithmetic unit chip setting method for constructing a low-energy-consumption arithmetic unit.

For ease of illustration, only one thick line is shown in FIG. 8, but this does not mean that there is only one bus or one type of bus.

The communication interface 1120 is used for communication between the above electronic device and other devices.

The memory 1130 may include Random Access Memory (RAM) or non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The processor 1110 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Accordingly, the present application also provides a computer readable storage medium storing a computer program which, when executed, implements the steps executable by the electronic device in the above method embodiments.

Claims

1. An arithmetic unit chip setting method, comprising:

Performing iterative loop on the child chip population until child individuals meeting the chip optimization target are selected, or when a preset iteration stop condition is reached, setting chip parameters of the target chip based on the optimal child individuals in the last generation chip population; the chip parameters at least comprise a chip structure, a logic unit and a circuit wiring structure;

the dynamic evaluation optimization model at least comprises: a pre-configuration layer, a simulation parameter layer, an evaluation strategy network layer, a value function network layer and an output layer;

The performing strategy optimization on the initial chip population by adopting the dynamic evaluation optimization model to obtain the optimal initial individual in the initial chip population comprises:

Selecting an initial individual with the highest cumulative rewards return value from the initial chip population as the optimal initial individual according to the cumulative rewards return value obtained in the last cycle through an output layer;

Mapping, by the pre-configuration layer, the initialization setting state of each initial individual in the initial chip population into a simulated environment state space to obtain a simulated environment state parameter of each initial individual, including:

The process of obtaining the initialization setting state vector $s_m$ of the initial individual $m$ is expressed as the following formula:

$$s_m = f(e_m) = [\,p_m,\ E_{\text{static}}(m),\ E_{\text{dynamic}}(m),\ A(m)\,]$$

Wherein $m$ denotes the $m$-th initial individual; $e_m$ represents the state vector obtained after the $m$-th initial individual is encoded; $s_m$ represents the simulation environment state vector corresponding to the $m$-th initial individual; $f(e_m)$ represents the calculation process function that obtains the corresponding initialization setting state vector $s_m$ from the state vector $e_m$; $p_m$ represents the performance index of the $m$-th initial individual; $E_{\text{static}}(m)$ represents the static power consumption of the $m$-th initial individual; $E_{\text{dynamic}}(m)$ represents the dynamic power consumption of the $m$-th initial individual; and $A(m)$ represents the area of the $m$-th initial individual;

the simulation parameter layer at least comprises: a simulator, a monitoring simulator;

The step of operating the simulation environment state parameters in the simulation environment state space through the simulation parameter layer to obtain the simulation chip operation condition of each initial individual comprises the following steps:

wherein, the chip operation monitoring data at least comprises: clock frequency, logic delay, static power consumption, dynamic power consumption, die area, on-chip memory area, error rate;

the cyclically calculating, through the evaluation strategy network layer and based on the running condition of the simulated chip, the cumulative reward return value of each initial individual with a preset reward feedback module until the set number of cycles or the preset convergence condition is reached comprises:

Performing cumulative calculation on the instant reward return value of each initial individual and the historical cumulative reward return value obtained in the previous iteration period to obtain the cumulative reward return value of each initial individual; the cumulative reward return value is the sum of the instant reward return value and the product of a discount factor and the historical cumulative reward return value;

the parameter update formula of the simulated state value function is expressed as:

$$\delta = R(s) + \gamma V(s') - V(s)$$

$$\Delta\theta = \alpha \, \nabla_{\theta} V(s) \, \delta$$

Wherein $\delta$ represents the reward prediction bias, that is, the error between the actual instant reward return value and the predicted instant reward return value; $R(s)$ represents the instant reward return value obtained by taking a certain parameter under state variable $s$ of the simulated state value function; $\gamma$ represents the discount factor; $V(s)$ represents the predicted state value parameter under state variable $s$ of the simulated state value function; $V(s')$ represents the predicted state value parameter under another state variable $s'$ of the simulated state value function; $\alpha$ is the learning rate; $\theta$ represents the parameters approximated by the simulated state value function; $\nabla_{\theta} V(s)$ represents the result of taking the gradient of the predicted state value parameter with respect to the parameter $\theta$; and $\Delta\theta$ represents the function approximation parameter update, used to update the parameters of the simulated state value function based on the reward prediction bias.

2. The arithmetic unit chip setting method of claim 1, wherein said performing an iterative loop on said child chip population comprises:

3. The arithmetic unit chip setting method according to claim 2, wherein the fitness parameter $f(x)$ of the child individual $x$ in the child chip population $P^{(t+1)}$ is calculated by using the following formula:

$$f(x) = \sum_{i=1}^{n} w_i \, f_i(x), \qquad f_i(x) = \frac{F_i(x) - F_i^{\min}}{F_i^{\max} - F_i^{\min}}$$

Wherein $(t+1)$ is the iteration number of the child chip population $P^{(t+1)}$; $n$ is the total number of evaluation indexes; $f_i(x)$ represents the fitness parameter score of child individual $x$ on the $i$-th evaluation index; $F_i(x)$ represents the actual score of child individual $x$ on the $i$-th evaluation index; $F_i^{\min}$ represents the minimum reference value of the $i$-th evaluation index; $F_i^{\max}$ represents the maximum reference value of the $i$-th evaluation index; and $w_i$ represents the weight coefficient on the $i$-th evaluation index;

4. A computing subsystem, the computing subsystem comprising:

The iteration setting unit is configured to execute cross mutation operation on the optimal initial individual so as to obtain a child chip population of the target chip; wherein each child individual in the child chip population is used for representing an iteration setting state of the target chip, and each iteration setting state corresponds to a set of iteration chip parameters of the target chip; performing iterative loop on the child chip population until child individuals meeting the chip optimization target are selected, or when a preset iteration stop condition is reached, setting chip parameters of the target chip based on the optimal child individuals in the last generation chip population; the chip parameters at least comprise a chip structure, a logic unit and a circuit wiring structure;

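$$s_m = f(e_m) = [\,p_m,\ E_{\text{static}}(m),\ E_{\text{dynamic}}(m),\ A(m)\,]$$

$$\delta = R(s) + \gamma V(s') - V(s)$$

$$\Delta\theta = \alpha \, \nabla_{\theta} V(s) \, \delta$$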

5. An intelligent computing platform, the intelligent computing platform comprising:

At least one processor, memory, and input output unit;

Wherein the memory is for storing a computer program, and the processor is for calling the computer program stored in the memory to execute the arithmetic unit chip setting method according to any one of claims 1 to 3.