CN114942895B - Address mapping strategy design method based on reinforcement learning


Info

Publication number
CN114942895B
Authority
CN
China
Prior art keywords
bim
reinforcement learning
strategy
network
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210714310.4A
Other languages
Chinese (zh)
Other versions
CN114942895A (en)
Inventor
魏榕山
徐楠楠
陈家扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202210714310.4A
Publication of CN114942895A
Application granted
Publication of CN114942895B
Active legal status (current)
Anticipated expiration of legal status

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/10: Address translation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention relates to an address mapping strategy design method based on reinforcement learning. A binary invertible matrix (Binary Invertible Matrix, BIM) is used to represent mainstream address mapping strategies, and the address mapping strategy with the best row cache hit rate is trained in combination with a reinforcement learning model. The invertibility of the BIM guarantees a valid mapping between physical addresses and the addresses of memory storage cells, and the BIM offers distinct advantages such as flexible expression of address mapping strategies and low hardware overhead.

Description

Address mapping strategy design method based on reinforcement learning
Technical Field
The invention relates to an address mapping strategy design method based on reinforcement learning.
Background
In computer architecture, processor performance has long improved faster than memory performance, so memory access latency has become a major factor limiting overall system performance. Since the "memory wall" problem was first identified, improving the performance of hardware accelerators has been one of the key research goals in computer architecture, and the memory controller is one of the keys to improving accelerator performance. Researchers at home and abroad have optimized memory controllers from various angles to reduce system latency. However, most address mapping strategies are highly application-specific, cannot be generalized to other applications, and lack the flexibility needed to achieve high-performance memory access in domain-specific accelerators.
Disclosure of Invention
The invention aims to provide an address mapping strategy design method based on reinforcement learning, which uses a binary invertible matrix to represent mainstream address mapping strategies and trains the address mapping strategy with the best row cache hit rate in combination with a reinforcement learning model.
In order to achieve the above purpose, the technical scheme of the invention is as follows: an address mapping strategy design method based on reinforcement learning uses a binary invertible matrix BIM to represent the address mapping strategy and trains the address mapping strategy with the best row cache hit rate in combination with a reinforcement learning model. The method is implemented as follows: the one-dimensional expansion of the binary invertible matrix BIM is taken as the input of the reinforcement learning model; the row cache hit rate of the initial BIM is taken as the current optimum H_best of the reinforcement learning model; the reinforcement learning model selects actions according to their probabilities to obtain candidate BIMs; when the row cache hit rate computed for a candidate BIM is higher than that of the current BIM, the reinforcement learning model replaces the current BIM with the candidate BIM; the reward value is then recalculated and the parameters of the reinforcement learning model are updated at the same time; the reinforcement learning model iterates and optimizes continuously in this way and converges, according to a preset stopping rule, to a trained BIM, which at the same time yields the address mapping strategy with the highest row cache hit rate.
Compared with the prior art, the invention has the following beneficial effects: the invention provides an address mapping strategy design method based on reinforcement learning, which for the first time combines a binary invertible matrix BIM with reinforcement learning for the address mapping strategy design of a memory controller. The BIM is extremely flexible in expressing address mapping strategies and can correctly represent all current mainstream address mapping strategies. In addition, the invention combines a policy-gradient-based reinforcement learning model, so that the BIM learns the address mapping strategy with the highest row cache hit rate for the different access patterns of a neural network accelerator. The trained BIM can then be implemented in hardware in the memory controller.
Drawings
Fig. 1 shows representations of address mapping policies.
Fig. 2 is a schematic diagram of mainstream address mapping policies represented by BIM.
FIG. 3 is a schematic diagram of BIM optimization by the reinforcement learning policy network.
FIG. 4 is the iterative BIM optimization algorithm.
FIG. 5 is the Mini-batch reinforcement learning model training algorithm.
FIG. 6 is a schematic diagram of a reinforcement learning model system workflow.
Detailed Description
The technical scheme of the invention is specifically described below with reference to the accompanying drawings.
The invention relates to an address mapping strategy design method based on reinforcement learning, which uses a binary invertible matrix BIM to represent the address mapping strategy and trains the address mapping strategy with the best row cache hit rate in combination with a reinforcement learning model. The method is implemented as follows: the one-dimensional expansion of the binary invertible matrix BIM is taken as the input of the reinforcement learning model; the row cache hit rate of the initial BIM is taken as the current optimum H_best of the reinforcement learning model; the reinforcement learning model selects actions according to their probabilities to obtain candidate BIMs; when the row cache hit rate computed for a candidate BIM is higher than that of the current BIM, the reinforcement learning model replaces the current BIM with the candidate BIM; the reward value is then recalculated and the parameters of the reinforcement learning model are updated at the same time; the reinforcement learning model iterates and optimizes continuously in this way and converges, according to a preset stopping rule, to a trained BIM, which at the same time yields the address mapping strategy with the highest row cache hit rate.
The following is a specific implementation procedure of the present invention.
The principle of a memory address mapping strategy is that physical addresses are mapped to specific memory cell locations in the DRAM according to certain rules. FIG. 1 illustrates the currently prevailing memory address mapping strategies, using a simplified 8-bit physical address to represent the different DRAM address mappings. The first 2 bits are Bank bits, followed by 4 row bits, and the last 2 bits are column address bits. Fig. 1 (a) shows BRC, in which addresses are mapped to physical addresses in Bank, row, column order. Fig. 1 (b) shows RBC, which swaps the Bank bits and the row bits, placing the row bits before the Bank bits while the column bits stay unchanged. Fig. 1 (c) shows bit inversion, i.e., the initial Bank bits and row bits are arranged in reverse order. Fig. 1 (d) shows Permutation-based mapping, which XORs the Bank bits with part of the row address bits to generate new Bank address bits. Fig. 1 (e) is the memory address mapping strategy based on a binary invertible matrix (Binary Invertible Matrix, BIM), which multiplies the initial physical address by the binary invertible matrix to obtain the address information of the corresponding BIM mapping.
All the strategies described above can be represented by a binary invertible matrix BIM. The strategy is implemented by multiplying the original address by the BIM to obtain the required address mapping. Because the BIM consists only of 0s and 1s, the memory address mapping can be realized in hardware with nothing more than AND gates and XOR gates, which perform the multiplication and addition operations respectively; this keeps the hardware overhead of memory address mapping low. The invertibility of the binary invertible matrix guarantees a valid mapping between physical addresses and the addresses of memory cells. As shown in fig. 2, the mainstream address mapping policies of fig. 2 (a)-(d) can all be expressed by a BIM. Because of this expressive flexibility and low hardware overhead, the BIM is the natural carrier of the memory address mapping strategy in the reinforcement-learning-based memory controller system.
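For illustration only (this sketch is not part of the patent disclosure), the BIM mapping can be prototyped in a few lines of Python/NumPy; the 8-bit address layout and the example matrices are assumed values chosen to mirror Fig. 1:

import numpy as np

def bim_map(bim, addr_bits):
    # Multiply-accumulate over GF(2): multiplication is AND, accumulation is XOR,
    # which is why the hardware reduces to AND gates and XOR gates.
    return (bim @ addr_bits) % 2

# Illustrative 8-bit address as in Fig. 1: 2 Bank bits, 4 row bits, 2 column bits.
addr = np.array([1, 0, 1, 1, 0, 1, 0, 1], dtype=np.uint8)

# The identity BIM leaves the address unchanged (the baseline layout of Fig. 1(a)).
bim_identity = np.eye(8, dtype=np.uint8)
assert np.array_equal(bim_map(bim_identity, addr), addr)

# A BIM whose rows are a permutation of the identity reorders the address bits,
# e.g. moving the row field in front of the Bank field (an RBC-like mapping).
bim_rbc_like = np.eye(8, dtype=np.uint8)[[2, 3, 4, 5, 0, 1, 6, 7]]
print(bim_map(bim_rbc_like, addr))  # -> [1 1 0 1 1 0 0 1]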
1. Reinforcement learning optimization of BIM
The optimization of the BIM in the invention mainly consists of applying elementary matrix transformations to a binary identity matrix within a policy gradient algorithm model. The action space of the reinforcement learning model consists of all possible row/column swap actions of the binary invertible matrix.
(1) Policy network design
In the invention, a policy network π is used to learn the actions that optimize the BIM address mapping strategy toward higher memory access efficiency. The policy network is designed as two cascaded fully connected layers; the first layer uses ReLU as its activation function to introduce non-linearity, and the output of the second fully connected layer feeds a Softmax function. Fig. 3 shows an example of BIM optimization. The binary invertible matrix BIM is expanded row by row into one-dimensional data that serves as the input of the policy network. Based on the output probability distribution, the model selects an action in the action space as the current optimization action for the BIM. The BIM is transformed according to this action into a new binary invertible matrix, which becomes the input of the policy network at the next time step. In the example, the BIM is simplified to a 6×6 binary invertible matrix serving as the address mapping policy; a corresponding row/column swap is selected according to the model output, and the optimization proceeds iteratively in this loop.
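As a concrete illustration of this design (a minimal sketch added here, not part of the original disclosure; the hidden width of 128 is an assumption), the two-layer policy network can be written in PyTorch as follows:

import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    # Two cascaded fully connected layers: ReLU introduces non-linearity after the
    # first layer, and Softmax over the action space follows the second layer.
    def __init__(self, b=32, n_actions=32, hidden=128):
        # hidden is an assumed width; n_actions equals b once the action space is
        # compressed as described in the next subsection.
        super().__init__()
        self.fc1 = nn.Linear(b * b, hidden)
        self.fc2 = nn.Linear(hidden, n_actions)

    def forward(self, bim_flat):
        x = torch.relu(self.fc1(bim_flat))
        return torch.softmax(self.fc2(x), dim=-1)

policy = PolicyNet()
bim = torch.eye(32).reshape(1, -1)           # BIM unrolled row by row into a one-dimensional input
probs = policy(bim)                          # probability distribution over the actions
action = torch.multinomial(probs, 1).item()  # sample the current optimization action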
(2) Action space optimization
In the binary invertible matrix BIM model, the number of actions in the action space of the reinforcement learning model is 2·C(b,2) = b·(b-1), where b is the number of rows/columns of the binary invertible matrix. For a 32×32 BIM this gives a total action space of 992, i.e., each BIM transformation step has 992 possible choices. Because training the reinforcement learning model requires many iterative learning passes, this action space makes the search space very large; an overly long action search degrades the learning process and reduces the performance of the reinforcement learning model. In order to solve the problem of an oversized action space, this section compresses the action space during BIM optimization.
From linear algebra it is known that any sequence of row/column swaps of a binary invertible matrix can be equivalently implemented with a number of row swaps. As shown in formula (1), a row swap of the BIM is performed by left-multiplying the BIM by a swap matrix M_pre, and a column swap of the BIM is performed by right-multiplying the BIM by a swap matrix M_post:
BIM' = M_pre · BIM,    BIM'' = BIM · M_post    formula (1)
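For intuition, a small worked example added here with 3×3 matrices (not from the patent): take BIM = [[1,1,0],[0,1,0],[0,1,1]] and let M_pre = M_post = [[0,1,0],[1,0,0],[0,0,1]], the swap matrix that exchanges indices 1 and 2. Then M_pre · BIM = [[0,1,0],[1,1,0],[0,1,1]], i.e. rows 1 and 2 of the BIM are exchanged, while BIM · M_post = [[1,1,0],[1,0,0],[1,0,1]], i.e. columns 1 and 2 are exchanged.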
The binary invertible matrix operations satisfy the commutative and associative laws, so a series of row/column swaps can be equivalently realized using row swaps alone. Therefore, this work compresses the action space into a set containing only row-swap actions; the resulting transformation is a product of row-swap matrices applied to the BIM.
After this compression, the action space is reduced by half. To shrink the action search space further, this work restricts the BIM transformation to swaps between the first row and each of the other rows, giving b-1 possible actions. The feasibility of this design rests on the fact that a swap of any two rows can be composed from swaps involving the first row, so the optimization result of the BIM is not affected. In addition, a hold action NOP is added to the action space. In summary, the action space of the BIM optimization model is finally reduced to b actions; for b = 32, the action space contains 32 actions.
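A minimal sketch of the compressed action space follows (added for illustration; the encoding of action indices is an assumption): action 0 is the hold action NOP, and action i for 1 <= i <= b-1 swaps the first row with row i.

import numpy as np

def apply_action(bim, action):
    # Compressed action space: action 0 is NOP; action i (1 <= i <= b-1) swaps
    # the first row with row i, giving b actions in total.
    new_bim = bim.copy()
    if action != 0:
        new_bim[[0, action]] = new_bim[[action, 0]]
    return new_bim

b = 32
bim = np.eye(b, dtype=np.uint8)
# b = 32 actions after compression, versus b*(b-1) = 992 row/column swaps before it.
bim_swapped = apply_action(bim, 5)                        # swap the first row with row 5
assert np.array_equal(apply_action(bim_swapped, 5), bim)  # swapping again restores the BIM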
(3) Iterative optimization
The reinforcement learning model optimizes the address mapping strategy BIM iteratively. First, the rows of the BIM are unfolded into a one-dimensional vector used as input, and the row cache hit rate H of the initial BIM address mapping strategy is measured and taken as the current optimum H_best of the model. k iterations are preset to complete the BIM optimization, and the result of each iteration is tested for its row hit rate; if the row hit rate is higher than H_best, that BIM is adopted as the current best address mapping strategy. The BIM is iteratively optimized in this way, and the row hit rate H_best increases with the iterations. Pseudocode for the iterative optimization of the address mapping strategy BIM is shown in fig. 4; a rough sketch follows below.
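The sketch below (added here, not the exact algorithm of FIG. 4) outlines the iteration in Python; row_hit_rate is a hypothetical helper wrapping the memory-trace simulation, and policy and apply_action are as in the earlier sketches:

import numpy as np
import torch

def optimize_bim(bim, policy, k, row_hit_rate):
    # Iterative BIM optimization: keep a candidate only when it raises H_best.
    h_best = row_hit_rate(bim)                       # hit rate of the initial BIM is H_best
    for _ in range(k):                               # preset number of iterations
        flat = torch.from_numpy(bim.astype(np.float32)).reshape(1, -1)
        probs = policy(flat)                         # action probabilities from the policy network
        action = torch.multinomial(probs, 1).item()  # choose an action according to the probabilities
        candidate = apply_action(bim, action)        # candidate BIM after the row swap (or NOP)
        h = row_hit_rate(candidate)
        if h > h_best:                               # adopt the candidate only when it improves H_best
            bim, h_best = candidate, h
    return bim, h_best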
2. Model training
During model training, the policy network generates the next action a_t at the current time step, and the current BIM is transformed according to this action into the BIM of the next time step. After k cycles, the policy network obtains a reward value r_k = H_k. Reinforcement learning then maximizes the cumulative reward value, and at the same time yields the BIM-based address mapping strategy with the highest row hit rate.
The present invention uses the policy gradient algorithm mentioned above to iteratively optimize the model. The cumulative reward value is computed as:
R_t = γ^(k+1) · r_k    formula (3)
where γ is the discount factor. The value function V_φ(BIM_t) is mainly used to predict the cumulative reward value, and the neural network containing the parameter φ is updated by means of the gradient.
The value network has the same intermediate structure as the policy network, consisting of two fully connected layers; the difference is that the output of the value network is a single scalar describing the predicted cumulative reward. The advantage of an action expresses how much better the reward of selecting this action in the current state is than having the policy network select an action at random. The specific formula is:
A_t = R_t - V_t    formula (4)
The objective function to be maximized is:
J(θ) = E_{π_θ}[ A_t · log π_θ(a_t | BIM_t) ]
The policy gradient is:
∇_θ J(θ) = E_{π_θ}[ A_t · ∇_θ log π_θ(a_t | BIM_t) ]
The loss function of the value network is:
L(φ) = ( R_t - V_φ(BIM_t) )^2
The gradient of the value network is:
∇_φ L(φ) = -2 · ( R_t - V_φ(BIM_t) ) · ∇_φ V_φ(BIM_t)
In the network model, the gradients of the parameters are calculated by the back-propagation algorithm; lr_π and lr_v are the learning rates of the policy network and the value network, respectively (for the specific formulas refer to fig. 5), and both are set to 0.001 in this work.
The invention updates the model parameters with the Mini-batch method. In the experiments, Batch is set to 64, meaning that within one Batch the policy network performs 64 iterative updates, and the experience (actions, rewards, etc.) collected over these 64 iterations is used to update the parameters of the model. However, a naive Mini-batch implementation stores all input data, intermediate results and other data, leading to a serious storage overhead. To solve this problem, the gradients within one Batch are accumulated and used as the parameter gradients; the accumulated gradients of the policy network and the value network are g_θ and g_φ (see fig. 5 for the specific formulas). The algorithm for training the reinforcement learning model with the Mini-batch method is shown in FIG. 5, and a sketch of the gradient accumulation follows below.
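The gradient accumulation can be sketched as follows (a simplification added here for the policy network only; policy_net, log_prob and advantage are placeholders standing in for the quantities of FIG. 5, and the value network is handled analogously):

import torch

batch_size = 64
lr_pi = 0.001
optimizer = torch.optim.SGD(policy_net.parameters(), lr=lr_pi)

optimizer.zero_grad()
for _ in range(batch_size):
    # ... run one optimization step as above, obtaining the log-probability of the
    # chosen action (log_prob) and its advantage A_t = R_t - V_t (advantage) ...
    loss = -(advantage.detach() * log_prob)  # policy-gradient loss for one step
    loss.backward()                          # gradients accumulate in .grad (the g_theta of FIG. 5)
optimizer.step()                             # a single parameter update per Batch
optimizer.zero_grad()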
3. Workflow process
The overall iterative optimization and training process of the invention is shown in fig. 6. The one-dimensional expansion of the 32×32 binary invertible matrix BIM is used as the input of the policy network and the value network, and a forward pass is performed through both networks. The forward pass of the policy network determines, based on the row hit rate, whether the BIM is updated: the model selects an action according to the output probabilities to obtain a candidate BIM, and when the computed row cache hit rate of the candidate is higher than that of the current BIM, the reinforcement learning system replaces the current BIM with the candidate. Then the row cache hit rate of the new BIM is recalculated, i.e., the reward value is computed, and the parameters of the two networks are updated. The system iterates and optimizes continuously in this way and converges, according to the preset stopping rule, to a trained BIM strategy. At the same time the address mapping strategy with the highest row hit rate is obtained, which can be ported to the hardware implementation in the FPGA MIG IP.
The above is a preferred embodiment of the present invention, and all changes made according to the technical solution of the present invention belong to the protection scope of the present invention when the generated functional effects do not exceed the scope of the technical solution of the present invention.

Claims (3)

1. An address mapping strategy design method based on reinforcement learning, characterized in that a binary invertible matrix BIM is used to represent the address mapping strategy, and the address mapping strategy with the best row cache hit rate is trained in combination with a reinforcement learning model; the method is implemented as follows: the one-dimensional expansion of the binary invertible matrix BIM is taken as the input of the reinforcement learning model; the row cache hit rate of the initial BIM is taken as the current optimum H_best of the reinforcement learning model; the reinforcement learning model selects actions according to their probabilities to obtain candidate BIMs; when the row cache hit rate computed for a candidate BIM is higher than that of the current BIM, the reinforcement learning model replaces the current BIM with the candidate BIM; the reward value is then recalculated and the parameters of the reinforcement learning model are updated at the same time; the reinforcement learning model iterates and optimizes continuously in this way and converges, according to a preset stopping rule, to a trained BIM; at the same time the address mapping strategy with the highest row cache hit rate is obtained; the action space of the reinforcement learning model consists of all possible row/column swap actions of the binary invertible matrix BIM, i.e., its size is 2·C(b,2) = b·(b-1), where b is the number of rows/columns of the binary invertible matrix; in order to solve the problem of an oversized action space of the reinforcement learning model, the action space of the reinforcement learning model is compressed, specifically as follows:
as shown in the following equation, the row-switching is performed on the BIM by multiplying a transposed matrix M pre on the left side of the BIM; column swapping BIM is performed by multiplying a transposed matrix M post on the right side of BIM:
the BIM operations satisfy the commutative and associative laws, so a series of row/column swaps is equivalently realized using row swaps alone; the action space is thus compressed into a set containing only row-swap actions, expressed as a product of row-swap matrices applied to the BIM;
after this compression of the action space, the action space is reduced by half; to shrink the action search space further, the transformation of the binary invertible matrix BIM is restricted to swaps between the first row and each of the other rows, giving b-1 possible actions; meanwhile, a hold action NOP is added to the action space; the number of actions of the reinforcement learning model is thus finally reduced to b.
2. The address mapping strategy design method based on reinforcement learning according to claim 1, characterized in that the reinforcement learning model consists of a policy network and a value network; the policy network consists of two cascaded fully connected layers, the first of which uses ReLU as its activation function, while the output of the second layer feeds a Softmax function; during training of the reinforcement learning model, the policy network generates the next action a_t at the current time step, and the BIM at the current time step is transformed according to this action into the BIM of the next time step; after the preset number of iterations k, the policy network obtains a reward value r_k = H_k, where H_k is the row cache hit rate of the BIM after k iterations; the cumulative reward value is computed as:
R_t = γ^(k+1) · r_k
where γ is the discount factor;
the value network has the same intermediate structure as the policy network, consisting of two fully connected layers; the difference is that the output of the value network is a single scalar describing the predicted cumulative reward; the advantage of an action expresses how much better the reward of selecting the corresponding action in the current environment is than having the policy network select an action at random; the specific formula is:
A_t = R_t - V_t
where A_t is the advantage function and V_t is the return estimated after selecting actions according to the policy π in state s_t;
the objective function to be maximized is:
J(θ) = E_{π_θ}[ A_t · log π_θ(a_t | BIM_t) ]
where J(θ) is the objective function to be maximized, and maximizing J(θ) continuously optimizes the parameters θ of the neural network model; π_θ is the policy of the policy gradient algorithm, i.e., the policy π parameterized by θ, which learns in the corresponding environment so as to maximize the cumulative reward value; BIM_t denotes the current binary invertible matrix;
the policy gradient, i.e., the partial derivative of the objective function to be maximized, is calculated as:
∇_θ J(θ) = E_{π_θ}[ A_t · ∇_θ log π_θ(a_t | BIM_t) ]
the loss function of the value network is:
L(φ) = ( R_t - V_φ(BIM_t) )^2
the gradient of the value network is:
∇_φ L(φ) = -2 · ( R_t - V_φ(BIM_t) ) · ∇_φ V_φ(BIM_t)
the value function V_φ(BIM_t) is used to predict the cumulative reward value, and the neural network containing the parameter φ is updated by means of the gradient;
in the reinforcement learning model, the gradients of the parameters are calculated by the back-propagation algorithm, and lr_π and lr_v are the learning rates of the policy network and the value network, respectively.
3. The address mapping strategy design method based on reinforcement learning according to claim 2, characterized in that the parameters of the reinforcement learning model are updated according to the Mini-batch method, the gradients of one Batch being accumulated as the parameter gradients, where the accumulated gradients of the policy network and the value network are g_θ and g_φ respectively.
CN202210714310.4A 2022-06-22 2022-06-22 Address mapping strategy design method based on reinforcement learning Active CN114942895B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210714310.4A CN114942895B (en) 2022-06-22 2022-06-22 Address mapping strategy design method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210714310.4A CN114942895B (en) 2022-06-22 2022-06-22 Address mapping strategy design method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN114942895A CN114942895A (en) 2022-08-26
CN114942895B true CN114942895B (en) 2024-06-04

Family

ID=82911016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210714310.4A Active CN114942895B (en) 2022-06-22 2022-06-22 Address mapping strategy design method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN114942895B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10452533B2 (en) * 2015-07-14 2019-10-22 Western Digital Technologies, Inc. Access network for address mapping in non-volatile memories

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858396A (en) * 2020-07-27 2020-10-30 福州大学 Memory self-adaptive address mapping method and system
CN113568845A (en) * 2021-07-29 2021-10-29 北京大学 Memory address mapping method based on reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An Efficient Memory Access Strategy for Image Transposition and Blocked Processing (面向图像转置和分块处理的一种高效内存访问策略); Shen Huanghui, Wang Zhensong, Zheng Weimin; Journal of Computer Research and Development; 2013-01-15 (No. 01), pp. 188-196 *

Also Published As

Publication number Publication date
CN114942895A (en) 2022-08-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant