CN116341286B

CN116341286B - Acceleration quantum heuristic solving method and device based on FPGA

Info

Publication number: CN116341286B
Application number: CN202310586541.6A
Authority: CN
Inventors: 苗子博; 刘梓璇; 潘宇; 崔巍
Original assignee: Harbin Institute of Technology, Shenzhen (Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology)
Current assignee: Harbin Institute of Technology, Shenzhen (Shenzhen Institute of Science and Technology Innovation, Harbin Institute of Technology)
Priority date: 2023-05-24
Filing date: 2023-05-24
Publication date: 2023-08-25
Anticipated expiration: 2043-05-24
Also published as: CN116341286A

Abstract

The invention relates to an acceleration quantum heuristic solving method based on an FPGA and a device thereof. The method comprises the following steps: mapping the optimization problem to be solved onto a two-dimensional Ising model; calculating the Hamiltonian of the Ising model in the original state; updating the spins of the two-dimensional Ising model to a state to be confirmed, and calculating the Hamiltonian of the Ising model in the state to be confirmed; calculating the difference between the Hamiltonian of the Ising model in the state to be confirmed and the Hamiltonian in the original state; deciding whether the Ising model transitions from the original state to a new state; and repeating the annealing steps until the two-dimensional Ising model reaches a preset end condition, at which point the states of all spins of the two-dimensional Ising model constitute the optimal solution of the optimization problem. The method combines the parallelism of the FPGA with the quantum heuristic algorithm, thereby accelerating the quantum heuristic algorithm and yielding an accelerated solver for combinatorial optimization problems.

Description

Acceleration quantum heuristic solving method and device based on FPGA

Technical Field

The invention belongs to the field of FPGA algorithms, and particularly relates to an acceleration quantum heuristic solving method and device based on an FPGA.

Background

Combinatorial optimization problems play an important role in key technologies and industrial applications such as machine learning, chip design and data mining, and their mathematical formulation usually involves a large amount of information that is random, dynamic and uncertain. The opportunities and challenges of solving such optimization problems have given the scientific community and industry growing interest in and enthusiasm for related research. Meta-heuristic algorithms, represented by ant colony optimization, genetic algorithms and simulated annealing, can still give a solution satisfying the conditions even for practical problems of high dimension, and can even find the optimal solution of the problem under certain special conditions. However, improving the solution accuracy comes at a significant cost, namely an exponential increase in solving time. The most common approach to combinatorial optimization problems is to run Monte Carlo algorithms on large high-performance classical computers; however, the high cost of improving solution accuracy remains unavoidable.

Disclosure of Invention

The invention provides an acceleration quantum heuristic solving method and device based on an FPGA, which aim to solve at least one of the technical problems existing in the prior art. The method and device combine the parallelism of the FPGA with the quantum heuristic algorithm, so that the quantum heuristic algorithm is accelerated and an accelerated solver for combinatorial optimization problems is obtained.

The technical scheme of the invention relates to an acceleration quantum heuristic solving method based on an FPGA, which comprises the following steps:

S100, mapping an optimization problem to be solved onto a two-dimensional Ising model, wherein the two-dimensional Ising model comprises a plurality of spins, each spin has its own spin state, the spin states are binary variables, and the spin state of each spin is initialized at an initial temperature;

S200, calculating the Hamiltonian of the two-dimensional Ising model in an original state, wherein the original state is the spin state of each spin in the current two-dimensional Ising model;

S300, updating the spins of the two-dimensional Ising model to a state to be confirmed, and calculating the Hamiltonian of the two-dimensional Ising model in the state to be confirmed, wherein the state to be confirmed is the spin state of each spin in the two-dimensional Ising model after the update;

S400, calculating the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state;

S500, judging, according to the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state, whether the two-dimensional Ising model transitions from the original state to a new state, and, if it does, taking the Hamiltonian of the state to be confirmed as the Hamiltonian of the new state;

S600, repeating steps S300 to S500 until the two-dimensional Ising model reaches a preset end condition, at which point the states of all spins of the two-dimensional Ising model constitute the optimal solution of the optimization problem.

Further, in step S200, the Hamiltonian is:

wherein

E is the Hamiltonian of the two-dimensional Ising model, N is the number of spins contained in the two-dimensional Ising model, i and j are the indices of the spins, with 0 < i ≤ N and 0 < j ≤ N, $J_{ij}$ is the interaction coefficient between spins i and j, and $h_i$ is the external magnetic field acting on the i-th spin.
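For illustration only, the Hamiltonian described by the symbols above can be evaluated in software as in the following minimal Python sketch. It assumes the commonly used form $E = -\sum_{i<j} J_{ij} s_i s_j - \sum_i h_i s_i$; the sign convention, the restriction to pairs i < j and the function name `ising_energy` are assumptions of this sketch and are not taken from the formula of the invention.

```python
from typing import Sequence


def ising_energy(spins: Sequence[int],
                 J: Sequence[Sequence[float]],
                 h: Sequence[float]) -> float:
    """Assumed form: E = -sum_{i<j} J[i][j]*s_i*s_j - sum_i h[i]*s_i."""
    n = len(spins)
    # Interaction term: each unordered pair (i, j) is counted once.
    pair_term = sum(J[i][j] * spins[i] * spins[j]
                    for i in range(n) for j in range(i + 1, n))
    # External-field term.
    field_term = sum(h[i] * spins[i] for i in range(n))
    return -pair_term - field_term


# Three spins on a chain with unit couplings and no external field.
J = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
h = [0.0, 0.0, 0.0]
print(ising_energy([+1, +1, +1], J, h))  # all spins aligned
print(ising_energy([+1, -1, +1], J, h))  # middle spin flipped
```

Under this assumed convention, aligned spins on positive couplings give the lower energy, which is the state an annealer would drift towards.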

Further, the step S500 further includes:

S510, if the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state is less than zero, accepting the transition of the two-dimensional Ising model from the original state to the new state;

S520, if the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state is greater than zero, judging whether the state to be confirmed of the two-dimensional Ising model satisfies a decision condition; if the decision condition is satisfied, accepting the transition of the two-dimensional Ising model from the original state to the new state; and if the decision condition is not satisfied, keeping the two-dimensional Ising model in the original state.

Further, in step S520, the decision condition is:

$r < e^{-\beta \Delta E}$,

wherein $\Delta E$ is the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state, $\beta = 1/(k_B T)$, $k_B$ is the Boltzmann constant, T is the current temperature of the model, r is a random number, and 0 ≤ r ≤ 1.
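The acceptance rule of steps S510 and S520 can be sketched in Python as follows. This is a minimal illustration and not the hardware implementation: the function name `accept_move`, the default `k_b = 1.0` (energies measured in units of $k_B$) and the use of $\beta = 1/(k_B T)$ are assumptions of the sketch.

```python
import math


def accept_move(delta_e: float, temperature: float, r: float,
                k_b: float = 1.0) -> bool:
    """Return True if the state to be confirmed should become the new state.

    delta_e     : Hamiltonian of the pending state minus that of the original state
    temperature : current temperature T (> 0)
    r           : random number with 0 <= r <= 1
    k_b         : Boltzmann constant (1.0 here, an assumption of this sketch)
    """
    if delta_e < 0:
        return True                       # S510: a lower energy is always accepted
    beta = 1.0 / (k_b * temperature)
    return r < math.exp(-beta * delta_e)  # S520: Metropolis-type decision condition


print(accept_move(-1.0, 1.0, r=0.99))  # downhill move
print(accept_move(1.0, 1.0, r=0.50))   # uphill move, r above exp(-1)
print(accept_move(1.0, 1.0, r=0.20))   # uphill move, r below exp(-1)
```

The random number is passed in explicitly so that the decision can be reproduced; in practice r would come from a pseudo-random source.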

Further, in the step S600, the preset end condition is:

the temperature of the two-dimensional Ising model has dropped to a preset termination temperature, or the number of times steps S300 to S500 have been repeated has reached a preset maximum number of executions.
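Steps S100 to S600, including both end conditions, can be put together in a short software sketch. The following Python code is illustrative only: the energy form, the single-spin-flip proposal, cooling on every accepted move, $k_B = 1$ and all parameter names and default values (`t0`, `t_end`, `alpha`, `max_steps`, `seed`) are assumptions of the sketch, not values from the invention.

```python
import math
import random


def energy(spins, J, h):
    """Assumed form: E = -sum_{i<j} J[i][j]*s_i*s_j - sum_i h[i]*s_i."""
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    return -pair - sum(h[i] * spins[i] for i in range(n))


def anneal(J, h, t0=10.0, t_end=0.01, alpha=0.95, max_steps=5000, seed=0):
    rng = random.Random(seed)
    n = len(h)
    spins = [rng.choice((-1, 1)) for _ in range(n)]   # S100: random initial state
    e_old = energy(spins, J, h)                        # S200: original-state energy
    temperature, steps = t0, 0
    # S600: stop at the termination temperature or the maximum step count.
    while temperature > t_end and steps < max_steps:
        candidate = list(spins)                        # S300: flip one random spin
        k = rng.randrange(n)
        candidate[k] = -candidate[k]
        e_new = energy(candidate, J, h)
        delta = e_new - e_old                          # S400: energy difference
        # S500: accept downhill moves, or uphill moves with r < exp(-dE/T).
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            spins, e_old = candidate, e_new
            temperature *= alpha                       # cool on each accepted move
        steps += 1
    return spins, e_old


# Four spins on a chain with unit couplings and no external field.
J = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
h = [0, 0, 0, 0]
best_spins, best_energy = anneal(J, h)
print(best_spins, best_energy)
```

The sketch recomputes the full Hamiltonian for the original and the pending state, mirroring the two Hamiltonian calculations of the method rather than a local energy update.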

The invention also provides an acceleration quantum heuristic solving device based on an FPGA, which is used for implementing the acceleration quantum heuristic solving method based on the FPGA, the device comprising:

a state change calculation module, comprising a first Hamiltonian calculation block, a first inverter, a second Hamiltonian calculation block and an adder, wherein the first Hamiltonian calculation block, the first inverter and the first input end of the adder are connected in sequence, the second Hamiltonian calculation block is connected to the second input end of the adder, the input of the first Hamiltonian calculation block is the spin state of the spins in the original state, and the input of the second Hamiltonian calculation block is the spin state of the spins in the state to be confirmed;

a judging module, comprising a negative value determiner and a random determiner, wherein the first input end of the negative value determiner and the first input end of the random determiner are each connected to the output end of the adder;

a two-to-one selection module, comprising a first OR gate, a second inverter, a first AND gate, a second AND gate, a second OR gate and a flip-flop, wherein the output end of the negative value determiner is connected to the first input end of the first OR gate, the output end of the random determiner is connected to the second input end of the first OR gate, the output end of the first OR gate is connected to the input end of the second inverter, the output end of the second inverter is connected to the second input end of the first AND gate, the first input end of the first AND gate is connected to the input end of the first Hamiltonian calculation block, the output end of the first AND gate is connected to the first input end of the second OR gate, the output end of the first OR gate is connected to the first input end of the second AND gate, the second input end of the second AND gate is connected to the input end of the second Hamiltonian calculation block, the output end of the second AND gate is connected to the second input end of the second OR gate, the output end of the second OR gate is connected to the first input end of the flip-flop, and the output end of the flip-flop is the spin state of the spins in the new state.

Further, the first Hamiltonian calculation block includes a first accumulator module and the second Hamiltonian calculation block includes a second accumulator module.

Further, the negative value determiner comprises a first input end, a second input end and an output end, wherein the first input end of the negative value determiner is connected to the output end of the adder, the second input end of the negative value determiner is tied to a zero value, and the output end of the negative value determiner is connected to the first input end of the first OR gate.

Further, the random determiner comprises a lookup table, a random number generator and a positive value determiner, wherein the output end of the adder is connected to the first input end of the lookup table, the output end of the lookup table is connected to the first input end of the positive value determiner, the output end of the random number generator is connected to the second input end of the positive value determiner, and the output end of the positive value determiner is connected to the second input end of the first OR gate.

Further, the device also comprises a clock input terminal, and the second input end of the lookup table and the second input end of the flip-flop are each connected to the clock input terminal.

Compared with the prior art, the invention has the following characteristics:

the invention combines the parallelism of the FPGA with the quantum heuristic algorithm, thereby accelerating the quantum heuristic algorithm and yielding an accelerated solver for combinatorial optimization problems.

Drawings

Fig. 1 is a flow chart of an accelerated quantum heuristic solution method based on an FPGA.

Fig. 2 is a flowchart of determining whether to update a state in an acceleration quantum heuristic solving method based on an FPGA.

Fig. 3 is a schematic diagram of an algorithm flow in an accelerated quantum heuristic solving method based on an FPGA.

Fig. 4 is a schematic diagram of an accelerated quantum heuristic solver based on an FPGA.

Fig. 5 shows the main implementation modules of an acceleration quantum heuristic solving device based on an FPGA.

Fig. 6 is a schematic diagram of a Galois-type LFSR in an FPGA-based accelerated quantum heuristic solver.

Fig. 7 is a schematic diagram of a state machine structure in an acceleration quantum heuristic solving device based on FPGA.

Fig. 8 is a schematic diagram of state transition of a state machine in an acceleration quantum heuristic solver based on FPGA.

Fig. 9 is a schematic diagram of the ADDER32 module in an accelerated quantum heuristic solver based on an FPGA.

Fig. 10 is a graph of the time to solution (TTS) on different problems for an FPGA-based accelerated quantum heuristic solver.

In the drawings: 100, state change calculation module; 110, first Hamiltonian calculation block; 120, first inverter; 130, second Hamiltonian calculation block; 140, adder; 200, judging module; 210, negative value determiner; 220, random determiner; 221, lookup table; 222, random number generator; 223, positive value determiner; 300, two-to-one selection module; 310, first OR gate; 320, second inverter; 330, first AND gate; 340, second AND gate; 350, second OR gate; 360, flip-flop.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The conception, specific structure, and technical effects produced by the present invention will be clearly and completely described below with reference to the embodiments and the drawings to fully understand the objects, aspects, and effects of the present invention.

It should be noted that, unless otherwise specified, when a feature is referred to as being "fixed" or "connected" to another feature, it may be directly or indirectly fixed or connected to the other feature. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the description presented herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any combination of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element of the same type from another. For example, a first element could also be termed a second element, and, similarly, a second element could also be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. Further, as used herein, the industry term "pose" refers to the position and pose of an element relative to a spatial coordinate system.

Referring to fig. 1 to 10, an embodiment of the present invention provides an acceleration quantum heuristic solving method based on FPGA, including the following steps:

The invention aims to find, by means of a simulated annealing algorithm, the minimum energy value E of the Ising model onto which the optimization problem is mapped, which corresponds to the optimal solution of the problem. To this end, the invention provides a light quantum heuristic solver based on an FPGA, which combines the parallelism of the FPGA with the quantum heuristic algorithm, thereby accelerating the quantum heuristic algorithm and yielding an accelerated solver for combinatorial optimization problems.

Specifically, referring to fig. 1, in step S100 the optimization problem to be solved is mapped onto a two-dimensional Ising model, with initial state $S_0$ and initial temperature $T_0$.

In a specific embodiment, the Ising problem belongs to the class of NPC (non-deterministic polynomial complete) problems. The mapping of the Partition problem, one of the NPC problems formulated by Lucas, is taken as an example:

Given a set $S = \{n_1, \ldots, n_N\}$ whose elements are positive numbers, suppose there is a partition that divides S into two disjoint subsets R and S - R such that the sums of the elements in the two subsets are equal. The Partition problem is mapped onto a two-dimensional Ising model, whose Hamiltonian E is given by the formula:

$E = A \left( \sum_{i=1}^{N} n_i s_i \right)^2$

wherein $n_i$ ($i = 1, \ldots, N$, $N = |S|$) denotes an element of the set S, $A > 0$, and $s_i = \pm 1$.

Clearly, if a solution with E = 0 exists, that is, the sum of the $n_i$ whose spins take the value -1 equals the sum of the $n_i$ whose spins take the value +1, then this solution is equivalent to a solution of the Partition problem. This completes the mapping from the Partition problem to the Ising problem.
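The Partition mapping can be checked numerically with a small Python sketch. It assumes the Partition Hamiltonian in Lucas's form, $E = A(\sum_i n_i s_i)^2$; the function names and the example numbers are hypothetical and chosen only for illustration.

```python
def partition_energy(numbers, spins, A=1.0):
    """Assumed Lucas form: E = A * (sum_i n_i * s_i)^2 with s_i = +1 or -1."""
    total = sum(n * s for n, s in zip(numbers, spins))
    return A * total ** 2


def subsets_from_spins(numbers, spins):
    """Spins equal to +1 select subset R; spins equal to -1 select S - R."""
    r = [n for n, s in zip(numbers, spins) if s == +1]
    rest = [n for n, s in zip(numbers, spins) if s == -1]
    return r, rest


numbers = [3, 1, 1, 2, 2, 1]
balanced = [+1, -1, -1, -1, +1, -1]      # {3, 2} against {1, 1, 2, 1}
print(partition_energy(numbers, balanced))
print(subsets_from_spins(numbers, balanced))
print(partition_energy(numbers, [+1] * 6))  # everything in one subset
```

A spin configuration with E = 0 directly yields the two subsets with equal sums, while any unbalanced configuration has strictly positive energy.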

In another specific embodiment, the binary integer linear programming (Binary Integer Linear Programming) problem is mapped as follows.

Description of the binary integer linear programming problem:

there is an N × 1 vector x,

wherein $x = (x_1, \ldots, x_N)'$, $x_1, \ldots, x_N$ are binary variables, subject to the constraint equation

$Sx = b$

where S is an m × N matrix and b is an m × 1 vector.

For a given vector c, the task is to maximize $c \cdot x$, assuming that a vector x satisfying the constraint exists. This NP problem is mapped onto the Hamiltonian E of a two-dimensional Ising model with $E = E_A + E_B$, as follows:

wherein A and B are constants greater than 0 satisfying $B \ll A$. When $E_A$ attains its minimum value 0, the constraint equation $Sx = b$ is exactly satisfied, so that solving the binary integer linear programming problem is converted into finding the minimum of the Hamiltonian E of the two-dimensional Ising model.
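As a hedged illustration of this mapping, the following Python sketch assumes the form used in Lucas's formulation of the problem, $E_A = A \sum_j (b_j - \sum_i S_{ji} x_i)^2$ and $E_B = -B \sum_i c_i x_i$. These two expressions, the function name `bilp_energy`, the values A = 10 and B = 1 and the toy instance are assumptions of the sketch and are not reproduced from the formulas of the invention.

```python
from itertools import product


def bilp_energy(x, S, b, c, A=10.0, B=1.0):
    """Assumed form: E = E_A + E_B with
    E_A = A * sum_j (b_j - sum_i S[j][i]*x_i)^2   (penalty for violating Sx = b)
    E_B = -B * sum_i c_i * x_i                    (rewards a large c.x)
    """
    e_a = A * sum((b_j - sum(s_ji * x_i for s_ji, x_i in zip(row, x))) ** 2
                  for row, b_j in zip(S, b))
    e_b = -B * sum(c_i * x_i for c_i, x_i in zip(c, x))
    return e_a + e_b


# Toy instance: choose exactly two of three items, maximizing c.x.
S = [[1, 1, 1]]
b = [2]
c = [3, 1, 2]

# Exhaustive search over all binary vectors (only feasible for tiny N).
best_x = min(product((0, 1), repeat=3), key=lambda x: bilp_energy(x, S, b, c))
print(best_x, bilp_energy(best_x, S, b, c))
```

Because B is small compared with A, any violation of Sx = b costs more than can be gained from c·x, so the minimum-energy vector is the feasible one with the largest objective. A binary variable can be related to a spin by $x_i = (s_i + 1)/2$.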

Referring to fig. 1, in step S300 the spins of the two-dimensional Ising model are updated to a state to be confirmed, and the Hamiltonian of the two-dimensional Ising model in the state to be confirmed is calculated, the state to be confirmed being the spin state of each spin in the two-dimensional Ising model after the update. The spin state S is updated according to the Metropolis criterion: let the current spin state of the two-dimensional Ising model be i and let a new state j be generated from the current state, with corresponding system Hamiltonians $E_i$ and $E_j$; the Hamiltonian difference of the system after a spin is flipped at the current temperature T is then calculated as $\Delta E = E_j - E_i$.

Further, referring to fig. 1, in step S200, the Hamiltonian is:

wherein

Specifically, the Ising model was originally proposed to describe the magnetism of crystals: assigning the values ±1 to the binary variables $s_i$ characterizes the spin direction of each spin under the action of an external magnetic field. The binary variable $s_i$ represents a spin, and the subscript i denotes the position of the spin in a d-dimensional regular lattice; the value of $s_i$ is the spin direction of the i-th spin under the external magnetic field of strength $h_i$. If the two-dimensional Ising model contains $N = L \times L$ spins, each spin interacts with its nearest-neighbor spins. The simulated annealing algorithm is used to find the minimum energy value E of the Ising model onto which the optimization problem is mapped, which is the optimal solution of the problem.
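The nearest-neighbor structure of an L × L lattice can be made concrete with a short Python sketch. Row-major spin numbering, open (non-periodic) boundaries, a uniform coupling strength and the sign convention of the energy are all assumptions of this sketch.

```python
def nearest_neighbor_pairs(L):
    """Index pairs (i, j), i < j, of nearest neighbors on an L x L lattice.

    Spins are numbered row by row; boundaries are open (no wrap-around).
    """
    pairs = []
    for row in range(L):
        for col in range(L):
            i = row * L + col
            if col + 1 < L:
                pairs.append((i, i + 1))   # neighbor to the right
            if row + 1 < L:
                pairs.append((i, i + L))   # neighbor below
    return pairs


def lattice_energy(spins, pairs, h, coupling=1.0):
    """Assumed form: E = -coupling * sum_<i,j> s_i*s_j - sum_i h_i*s_i."""
    pair_term = sum(coupling * spins[i] * spins[j] for i, j in pairs)
    field_term = sum(h_i * s_i for h_i, s_i in zip(h, spins))
    return -pair_term - field_term


pairs = nearest_neighbor_pairs(3)
print(len(pairs))
print(lattice_energy([1] * 9, pairs, [0.0] * 9))
```

With open boundaries an L × L lattice has 2L(L - 1) nearest-neighbor pairs, so only a small fraction of the N × N coefficients $J_{ij}$ is non-zero.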

Further, referring to fig. 1 and 2, the step S500 further includes:

Further, referring to fig. 2, in step S520, the decision condition is:

$r < e^{-\beta \Delta E}$,

wherein $\Delta E$ is the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state, $\beta = 1/(k_B T)$, $k_B$ is the Boltzmann constant, T is the current temperature of the model, r is a random number, and 0 ≤ r ≤ 1.

Specifically, if the change of a spin state decreases the Hamiltonian of the system, i.e. $\Delta E < 0$, the change of the two-dimensional Ising model in this direction is accepted; if $\Delta E > 0$, it is judged whether a randomly generated number r, 0 ≤ r ≤ 1, and the Boltzmann factor $e^{-\beta \Delta E}$ satisfy the formula:

$r < e^{-\beta \Delta E}$

If this formula is satisfied, the change of the spin is accepted and the temperature is updated as $T_{i+1} = \alpha T_i$; otherwise the spin remains unchanged and does not change in this direction. Here $\beta = 1/(k_B T_i)$, $T_i$ is the current temperature of the two-dimensional Ising model, $\alpha$ is the learning rate, and $k_B$ is the Boltzmann constant.

Further, referring to fig. 1, in the step S600, the preset end condition is:

Specifically, the annealing process is repeated until the temperature T of the model falls to a prescribed value $T_c$. At this point the whole annealing algorithm is complete, and the state of each spin of the two-dimensional Ising model is obtained, which is the optimal solution of the optimization problem, $S_{best} = \{s_1, \ldots, s_N\}$.

Referring to fig. 4, the invention further provides an acceleration quantum heuristic solving device based on an FPGA, configured to implement the acceleration quantum heuristic solving method based on the FPGA, the device comprising:

a state change calculation module 100, comprising a first Hamiltonian calculation block 110, a first inverter 120, a second Hamiltonian calculation block 130 and an adder 140, wherein the first Hamiltonian calculation block 110, the first inverter 120 and the first input end of the adder 140 are connected in sequence, the second Hamiltonian calculation block 130 is connected to the second input end of the adder 140, the input of the first Hamiltonian calculation block 110 is the spin state of the spins in the original state, and the input of the second Hamiltonian calculation block 130 is the spin state of the spins in the state to be confirmed;

a judging module 200, comprising a negative value determiner 210 and a random determiner 220, wherein the first input end of the negative value determiner 210 and the first input end of the random determiner 220 are each connected to the output end of the adder 140;

a two-to-one selection module 300, comprising a first OR gate 310, a second inverter 320, a first AND gate 330, a second AND gate 340, a second OR gate 350 and a flip-flop 360, wherein the output end of the negative value determiner 210 is connected to the first input end of the first OR gate 310, the output end of the random determiner 220 is connected to the second input end of the first OR gate 310, the output end of the first OR gate 310 is connected to the input end of the second inverter 320, the output end of the second inverter 320 is connected to the second input end of the first AND gate 330, the first input end of the first AND gate 330 is connected to the input end of the first Hamiltonian calculation block 110, the output end of the first AND gate 330 is connected to the first input end of the second OR gate 350, the output end of the first OR gate 310 is connected to the first input end of the second AND gate 340, the second input end of the second AND gate 340 is connected to the input end of the second Hamiltonian calculation block 130, the output end of the second AND gate 340 is connected to the second input end of the second OR gate 350, the output end of the second OR gate 350 is connected to the first input end of the flip-flop 360, and the output end of the flip-flop 360 is the spin state of the spins in the new state.
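The gate network of the selection module behaves like a two-to-one multiplexer, which can be checked with a bit-level Python sketch. The encoding of a spin as a single bit (for example 1 for +1 and 0 for -1) and the function and signal names are assumptions of this sketch; the gate numbering follows the reference numerals above.

```python
def select_new_spin(old_bit, pending_bit, negative_flag, random_flag):
    """Bit-level model of the selection logic (all arguments are 0 or 1).

    old_bit       : spin bit of the original state
    pending_bit   : spin bit of the state to be confirmed
    negative_flag : output of the negative value determiner 210
    random_flag   : output of the random determiner 220
    """
    accept = negative_flag | random_flag   # first OR gate 310
    keep = accept ^ 1                      # second inverter 320
    from_old = old_bit & keep              # first AND gate 330
    from_pending = pending_bit & accept    # second AND gate 340
    return from_old | from_pending         # second OR gate 350, stored in flip-flop 360


# The pending bit is taken whenever either determiner outputs 1.
for flags in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(flags, select_new_spin(old_bit=0, pending_bit=1,
                                 negative_flag=flags[0], random_flag=flags[1]))
```

Exactly one of the two AND gates is enabled at any time, so the second OR gate simply forwards either the original or the pending spin bit.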

Further, the first hamiltonian calculation block 110 includes a first accumulator module, and the second hamiltonian calculation block 130 includes a second accumulator module.

The first accumulator module and the second accumulator module each include an ADDER32 module. The accumulation calculation is realized through an ADDER32 module.

Specifically, referring to FIG. 9, the parallelism of the FPGA is embodied in the computation of E(SPIN), whose calculation formula is:

In this design, after the spin state S has been read in, 32 products $J_{ij} s_i s_j$ and 32 products $h_i s_i$ can be computed simultaneously within one clock cycle, and from the second clock cycle onward the products computed in the previous clock cycle can be accumulated. The ADDER32 module is designed for this purpose.

The ADDER32 module has a five-stage pipeline structure consisting of a first-stage sub-module, a second-stage sub-module, a third-stage sub-module, a fourth-stage sub-module and a fifth-stage sub-module, wherein the first-stage sub-module comprises 16 ADDER2 adders, the second-stage sub-module comprises 8 ADDER2 adders, the third-stage sub-module comprises 4 ADDER2 adders, the fourth-stage sub-module comprises 2 ADDER2 adders, and the fifth-stage sub-module comprises 1 ADDER2 adder. Each ADDER2 adder has two inputs and one output, adds its two inputs and outputs the sum, and further comprises at least one register for storing the result of the addition.

Specifically, the 16 ADDER2 adders of the first-stage sub-module provide 32 input ends and 16 output ends; the 16 output ends are connected in sequence to the 16 input ends of the 8 ADDER2 adders of the second-stage sub-module; the 8 output ends of the second-stage sub-module are connected to the 8 input ends of the 4 ADDER2 adders of the third-stage sub-module; the 4 output ends of the third-stage sub-module are connected to the 4 input ends of the 2 ADDER2 adders of the fourth-stage sub-module; and the 2 output ends of the fourth-stage sub-module are connected to the 2 input ends of the single ADDER2 adder of the fifth-stage sub-module, which has 1 output end. The sum of 32 addends is therefore obtained in only 5 clock cycles, and when data are fed into the ADDER32 module in a pipelined manner, the results are likewise output in a pipelined manner.
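The five-stage adder tree and its pipelined behavior can be modeled cycle by cycle in software. The following Python sketch is a behavioral model only, not a hardware description: the class name `Adder32Pipeline`, the `clock` method and the representation of the stage registers as Python lists are assumptions of the sketch.

```python
def pairwise_sums(values):
    """One pipeline stage: each ADDER2 adds two neighboring inputs."""
    return [values[k] + values[k + 1] for k in range(0, len(values), 2)]


class Adder32Pipeline:
    """Behavioral model of a 16-8-4-2-1 adder tree with a register per stage."""

    def __init__(self):
        # regs[k] holds the registered outputs of stage k + 1 (None while empty).
        self.regs = [None] * 5

    def clock(self, inputs=None):
        """Advance one clock cycle; optionally feed 32 new addends.

        Returns the value in the stage-5 register after this cycle,
        or None if no result has reached the output yet.
        """
        if inputs is not None and len(inputs) != 32:
            raise ValueError("ADDER32 expects exactly 32 addends")
        new_regs = [None] * 5
        new_regs[0] = pairwise_sums(inputs) if inputs is not None else None
        for k in range(1, 5):
            prev = self.regs[k - 1]
            new_regs[k] = pairwise_sums(prev) if prev is not None else None
        self.regs = new_regs
        return self.regs[4][0] if self.regs[4] is not None else None


pipe = Adder32Pipeline()
outputs = [pipe.clock(list(range(32))),   # cycle 1: first vector enters
           pipe.clock([1] * 32),          # cycle 2: second vector enters
           pipe.clock(), pipe.clock(), pipe.clock(), pipe.clock()]
print(outputs)
```

In this model the first sum appears after five clock cycles and the second sum one cycle later, which is the throughput benefit of feeding the tree in a pipelined manner.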

Further, referring to fig. 4, the negative value determiner 210 includes a first input terminal, a second input terminal, and an output terminal, the first input terminal of the negative value determiner 210 is connected to the output terminal of the adder 140, the second input terminal of the negative value determiner 210 is connected to a zero value, and the output terminal of the negative value determiner 210 is connected to the first input terminal of the first or gate 310.

Specifically, referring to fig. 4, the comparison operator of the negative value determiner 210 is <=, and the second input end of the negative value determiner 210 is tied to 0, i.e. to low level. This means that if the difference between the Hamiltonian of the two-dimensional Ising model in the state to be confirmed and the Hamiltonian in the original state is less than zero, the negative value determiner 210 outputs 1, i.e. high level.

Further, referring to fig. 4, the random determiner 220 includes a lookup table 221, a random number generator 222 and a positive value determiner 223, wherein the output end of the adder 140 is connected to the first input end of the lookup table 221, the output end of the lookup table 221 is connected to the first input end of the positive value determiner 223, the output end of the random number generator 222 is connected to the second input end of the positive value determiner 223, and the output end of the positive value determiner 223 is connected to the second input end of the first OR gate 310.

Specifically, referring to FIG. 4, $e^{-\beta \Delta E}$ is compared with the random number r generated by the random number generator 222 (LFSR32); if $e^{-\beta \Delta E} \geq r$, 1 (high level) is output. The two comparison results are fed to the first OR gate 310. When the output of the first OR gate 310 is 1 (high level), that is, when either of the two comparators outputs 1 (high level), the second inverter 320 drives the second input of the first AND gate 330 low, so the first AND gate 330 outputs 0 (low level) while the second AND gate 340 passes the state to be confirmed; the outputs of the first AND gate 330 and the second AND gate 340 are ORed in the second OR gate 350, and the result is stored in the flip-flop 360. Referring to FIG. 2, the input Old SPIN represents the spin state in the original state, the input Old SPIN' represents the state to be confirmed, and the output New SPIN represents the new state. In one particular embodiment, the new state is fed back to Old SPIN as the input of the first Hamiltonian calculation block 110 in the next clock cycle.
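The interplay of the lookup table and the positive value determiner can be illustrated with a fixed-point Python sketch. Everything specific in it is an assumption made for illustration: integer-valued energy differences, a table indexed directly by $\Delta E$, 32-bit scaling of the Boltzmann factor, a table rebuilt for each temperature, and the names `build_exp_lut` and `random_judge`.

```python
import math


def build_exp_lut(beta, max_delta, bits=32):
    """Precompute exp(-beta * dE) for dE = 0..max_delta as unsigned integers.

    The table is valid for one value of beta, i.e. one temperature.
    """
    scale = (1 << bits) - 1
    return [int(scale * math.exp(-beta * d)) for d in range(max_delta + 1)]


def random_judge(delta_e, lut, rand_word):
    """Output 1 if the tabulated Boltzmann factor is >= the random word.

    delta_e   : integer energy difference (clamped to the table range)
    rand_word : unsigned random integer on the same scale as the table
    """
    index = min(max(delta_e, 0), len(lut) - 1)
    return 1 if lut[index] >= rand_word else 0


lut = build_exp_lut(beta=1.0, max_delta=8)
print(random_judge(0, lut, rand_word=12345))    # factor 1.0 beats any random word
print(random_judge(8, lut, rand_word=1 << 31))  # exp(-8) is far below one half
```

Scaling both the table entries and the random word to the same integer range turns the comparison into a plain unsigned compare, which avoids evaluating an exponential at run time.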

Further, referring to fig. 4, a clock input clk is further included, and a second input of the lookup table 221 and a second input of the flip-flop 360 are connected to the clock input clk, respectively. For simplicity and convenience of hardware design, the look-up table 221 and the flip-flop 360 employ a uniform clock signal clk, and in some embodiments, the second input terminal of the look-up table 221 and the second input terminal of the flip-flop 360 may be respectively connected to different clock input signals.

In addition, the random number generator 222 (LFSR 32) has an independent clock, and the clock of the random number generator 222 (LFSR 32) is not required to be consistent with the clock signal clk, so long as the second input terminal of the positive value determiner 223 can obtain a random number from the random number generator 222 (LFSR 32) as the input of the second input terminal of the positive value determiner 223 after the look-up table 221 obtains the result.

In some embodiments, the random number generator 222 may be implemented by other circuit modules, so long as it can output a time-varying random number.

In one embodiment, the selected development board model is ALINX-AX7Z100, whose main resources are listed in Table 1:

Table 1

Slice registers	LUT	DSP	LUTRAM
554,800	277,400	2020	108,200

Referring to fig. 4, an FPGA is a field-programmable gate array that can be used to implement custom hardware functions and is inherently parallel, so it can further accelerate the operation of a two-dimensional Ising machine. Since the spin states in the two-dimensional Ising model take only the values {1, -1}, the difficulty and resource consumption of simulating an Ising machine on an FPGA are reduced.

In implementing the algorithm, the parallelism of the FPGA is exploited to execute simultaneously those steps of the algorithm that have no data dependency; for example, the spin state updates of the two-dimensional Ising model can be performed in parallel. For modules that can be reused, such as the random number generator 222 (LFSR32), the control circuitry is designed so that the resource is shared.

Specifically, referring to FIGS. 5 and 6, the overall implementation is planned as follows:

The first step is to implement the arithmetic module. In addition to basic addition, subtraction, multiplication and division, the FPGA must also generate pseudo-random numbers. The most common implementation is a linear feedback shift register (LFSR), which typically consists of a shift register and exclusive-OR logic gates. Referring to FIG. 6, the taps of the 32-bit Galois-type LFSR are at bits 32, 22, 2 and 1. The exponential function must also be computed, which can be done by table lookup.
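A software model of a 32-bit Galois LFSR with taps at bits 32, 22, 2 and 1 might look like the sketch below. The shift direction (right shift, feedback applied when the outgoing bit is 1) and the mapping of tap t to mask bit t-1 are assumptions, since the text gives only the tap positions; `TAP_MASK`, `lfsr32_step` and `lfsr32_stream` are hypothetical names.

```python
TAP_MASK = 0x80200003  # mask bits 31, 21, 1 and 0, i.e. taps 32, 22, 2 and 1


def lfsr32_step(state):
    """Advance the Galois LFSR by one shift and return the new 32-bit state."""
    lsb = state & 1          # bit shifted out on this step
    state >>= 1
    if lsb:
        state ^= TAP_MASK    # apply feedback at all tap positions at once
    return state


def lfsr32_stream(seed, count):
    """Return `count` successive states starting from a non-zero 32-bit seed."""
    if seed == 0 or seed >> 32:
        raise ValueError("seed must be a non-zero 32-bit value")
    values = []
    state = seed
    for _ in range(count):
        state = lfsr32_step(state)
        values.append(state)
    return values
```

The all-zero state is excluded because it maps to itself and would lock the register.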

The second step is to implement the logic module. The main purpose of using the FPGA is to increase the parallelism of the two-dimensional Ising model implementation, so, while balancing area and speed, a state machine structure is adopted to ensure that the FPGA executes the steps in sequence. Referring to FIG. 7, a pipeline structure is adopted to further increase parallelism and improve the performance of the two-dimensional Ising model as much as possible.

In a specific embodiment, the implementation of the whole algorithm on an FPGA comprises the steps of:

1. Map the optimization problem onto a two-dimensional Ising model to obtain the corresponding coefficients $J_{ij}$, $h_i$. Randomly initialize the two-dimensional Ising model of problem scale N to obtain the initial state of the N spins:

$S_0 = \{s_{10}, s_{20}, s_{30}, \ldots, s_{N0}\}$

At this time, the Hamiltonian of the model is denoted $E_0$ and the temperature of the model is set to $T_0$.

2. To find the minimum Hamiltonian of the two-dimensional Ising model, an annealing algorithm is adopted in which the spin states change randomly, giving:

$S'_0 = \{s'_{10}, s'_{20}, s'_{30}, \ldots, s'_{N0}\}$

The corresponding system Hamiltonian is $E'_0$. Then calculate $\Delta E = E'_0 - E_0$. If $\Delta E < 0$ or $r < e^{-\beta \Delta E}$, where $r$ is a random number, the spin state of the Ising model is updated to $S_1 = S'_0$, $T_1 = \alpha T_0$. Otherwise, $S_1 = S_0$, $T_1 = T_0$.

3. Repeat step 2 until the temperature of the two-dimensional Ising model is less than $T_c$.
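Steps 1 to 3 can be sketched in software as follows. This is an illustration under stated assumptions, not the FPGA implementation: the Hamiltonian sign convention, the relation beta = 1/T, the first acceptance test being a negative energy difference, and the single-spin-flip move are all assumptions, and the function names and default parameter values are hypothetical.

```python
import math
import random


def energy(spins, J, h):
    """Ising Hamiltonian, assuming H = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i."""
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field


def anneal(J, h, t0=10.0, alpha=0.95, t_c=0.01, max_iters=10_000, seed=0):
    """Return (spins, energy) after annealing from T0 until T < Tc or max_iters."""
    rng = random.Random(seed)
    n = len(h)
    spins = [rng.choice((-1, 1)) for _ in range(n)]   # step 1: random initial state
    e = energy(spins, J, h)
    t = t0
    iters = 0
    while t >= t_c and iters < max_iters:             # step 3: stopping conditions
        iters += 1
        candidate = spins[:]
        k = rng.randrange(n)                          # step 2: flip one random spin
        candidate[k] = -candidate[k]
        e_new = energy(candidate, J, h)
        delta = e_new - e
        beta = 1.0 / t                                # assumption: beta = 1/T
        if delta < 0 or rng.random() < math.exp(-beta * delta):
            spins, e = candidate, e_new               # accept the candidate state
            t = alpha * t                             # cool only on acceptance
    return spins, e
```

Following the text, the temperature is lowered only when a move is accepted, so the iteration cap `max_iters` is added here to guarantee termination.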

Specifically, referring to FIG. 8, because the FPGA is naturally parallel, a state machine structure is adopted to force the algorithm to execute steps 1 to 3 in order. The relationship between the state machine and the overall algorithm is as follows:

(1) The whole algorithm can be divided into four stages: initialization, running the annealing algorithm, ending the annealing algorithm (the stopping condition is that the system temperature T falls to the specified value $T_c$ or the number of annealing iterations exceeds a specified value $N_c$), and outputting the current optimal solution.

(2) These stages are encoded as 00, 01, 10 and 11 respectively, so the state machine has 4 states (00, 01, 10, 11).
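The four-state controller described above could be modelled roughly as below. The transition conditions between stages are an interpretation of the text rather than something it spells out; in particular, treating "temperature falls to Tc" as a less-than-or-equal test and holding the output state once reached are assumptions. `Stage` and `next_stage` are hypothetical names.

```python
from enum import Enum


class Stage(Enum):
    INIT = 0b00      # initialization
    ANNEAL = 0b01    # running the annealing algorithm
    END = 0b10       # ending the annealing algorithm
    OUTPUT = 0b11    # outputting the current optimal solution


def next_stage(stage, temperature, iterations, t_c, n_c):
    """Return the stage that follows `stage` for the given temperature and count."""
    if stage is Stage.INIT:
        return Stage.ANNEAL
    if stage is Stage.ANNEAL:
        # Stop when T has fallen to Tc or the iteration count exceeds Nc.
        if temperature <= t_c or iterations > n_c:
            return Stage.END
        return Stage.ANNEAL
    # END moves on to OUTPUT; OUTPUT is held (assumption).
    return Stage.OUTPUT
```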

Specifically, addition, subtraction, multiplication and division are implemented with the built-in digital signal processing (DSP) resources provided through Vivado, the integrated design environment released by the FPGA manufacturer Xilinx. Note that division is avoided in the FPGA implementation: dividing by a number x is converted into multiplying by its reciprocal 1/x, so the basic operations actually implemented are only addition, subtraction and multiplication.
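As a rough illustration of replacing division with multiplication by a reciprocal, the sketch below uses a fixed-point reciprocal with 16 fractional bits. The fixed-point format and the names `FRAC_BITS`, `reciprocal_fixed` and `divide_by` are assumptions chosen for the example; the text does not specify the number format used on the FPGA.

```python
FRAC_BITS = 16  # number of fractional bits in the fixed-point reciprocal (assumption)


def reciprocal_fixed(x):
    """Precompute 1/x as an integer scaled by 2**FRAC_BITS."""
    return round((1 << FRAC_BITS) / x)


def divide_by(a, recip):
    """Approximate a / x using one multiplication and one shift."""
    return (a * recip) >> FRAC_BITS
```

For a power-of-two divisor the result is exact, for example `divide_by(1000, reciprocal_fixed(8))` gives 125; for other divisors the truncation can make the result differ slightly from the exact quotient.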

Referring to FIG. 10, through module sharing, when N = 200 the operation speed is 20 to 30 times faster than that of a conventional CPU.

Referring to Table 2, the resources occupied by the quantum heuristic solver are as follows:

Table 2

Slice Registers	LUT	DSP	LUTRAM
0.26%	0.36%	0.59%	0.09%

Meanings of the abbreviations used in the invention:

FPGA: Field Programmable Gate Array, a product developed further from programmable devices such as PAL (programmable array logic) and GAL (generic array logic). It serves as a semi-custom circuit in the field of application-specific integrated circuits (ASICs), addressing the shortcomings of fully custom circuits while overcoming the limited gate count of earlier programmable devices.

MC: Monte Carlo, also known as the statistical simulation method.

GPU: Graphics Processing Unit.

NPC: Non-deterministic Polynomial-time Complete. NP refers to problems that can be solved in polynomial time by a non-deterministic Turing machine; an equivalent definition is problems whose solutions can be verified in polynomial time. NP-complete problems are the problems in NP to which every problem in NP can be reduced in polynomial time.

LFSR32: 32-bit Linear-Feedback Shift Register.

LUT: Look-Up Table.

CLK: a clock signal.

DSP: Digital Signal Processor.

LUTRAM: distributed RAM, i.e. random access memory implemented with look-up tables.

TTS: Time To Solution, the time taken to solve the problem.

Definition of the time to solution TTS:

where $T_1$ represents the time taken by one iteration of the annealing algorithm, and P represents the probability of successfully finding the minimum of the two-dimensional Ising model Hamiltonian when running the annealing algorithm.

N: the number of spins in the Ising model, which is also the scale (dimension) of the combinatorial optimization problem.

Problem set naming:

maxcut problem: the Max-cut problem refers to finding, for a given directed weighted graph, a partition of the vertices into two sets that maximizes the sum of the weights of all edges crossing between the two sets.

maxcut_d3: Max-cut problem with density equal to 3; $J_{ij} = 0$ or $1$, $h_i = 0$.

maxcut_dense: dense Max-cut problem (density > 3); $J_{ij} = 0$ or $1$, $h_i = 0$ (more non-zero coefficients than maxcut_d3).

sk problem (Sherrington-Kirkpatrick model), a spin glass model:

sk_ising, the two-dimensional Ising model mapped from the spin glass model: $J_{ij} = \pm 1$, $h_i = 0$.

sk_uniform, where the coefficients obey a uniform distribution: $J_{ij} = 0$ or $U(0, 1)$, $h_i = 0$.

sk_01: $J_{ij} = 0$ or $1$, $h_i = 0$.

Spin (the problem set generated directly from the Ising model by drawing all coefficients at random from $U(0, 1)$): the two-dimensional Ising model spin_model obeys the following distribution:

$J_{ij} = U(0, 1)$, $h_i = U(0, 1)$ (U denotes the uniform distribution); spin_model is the label of one dataset.
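Two of the problem sets above, sk_ising and spin_model, can be generated with a short sketch such as the following. It assumes a symmetric coupling matrix with a zero diagonal and equal probability for the two signs in sk_ising, neither of which is stated in the text; the function names are hypothetical. The other problem sets are omitted because their density parameters are not fully specified here.

```python
import random


def symmetric_couplings(n, draw, rng):
    """Build an n x n symmetric matrix with zero diagonal from draw(rng)."""
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = draw(rng)
    return J


def make_sk_ising(n, seed=0):
    """sk_ising: J_ij = +1 or -1 (equal probability assumed), h_i = 0."""
    rng = random.Random(seed)
    J = symmetric_couplings(n, lambda r: r.choice((-1, 1)), rng)
    return J, [0] * n


def make_spin_model(n, seed=0):
    """spin_model: J_ij and h_i drawn from the uniform distribution U(0, 1)."""
    rng = random.Random(seed)
    J = symmetric_couplings(n, lambda r: r.random(), rng)
    h = [rng.random() for _ in range(n)]
    return J, h
```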

It should be appreciated that the method steps in embodiments of the present invention may be implemented or carried out by computer hardware, a combination of hardware and software, or by computer instructions stored in non-transitory computer-readable memory. The method may use standard programming techniques. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.

Furthermore, the operations of the processes described herein may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes (or variations and/or combinations thereof) described herein may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications), by hardware, or combinations thereof, collectively executing on one or more processors. The computer program includes a plurality of instructions executable by one or more processors.

Further, the method may be implemented on any suitable type of computing platform, including, but not limited to, a personal computer, mini-computer, mainframe, workstation, network or distributed computing environment, separate or integrated computer platform, or a platform in communication with a charged particle tool or other imaging device, and so forth. Aspects of the invention may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optical read and/or write storage medium, RAM, ROM, etc., such that the code is readable by a programmable computer and, when read by the computer, configures and operates the computer to perform the processes described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. When such media include instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above, the invention described herein includes these and other different types of non-transitory computer-readable storage media. The invention may also include the computer itself when programmed according to the methods and techniques of the present invention.

The computer program can be applied to the input data to perform the functions described herein, thereby converting the input data to generate output data that is stored to the non-volatile memory. The output information may also be applied to one or more output devices such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including specific visual depictions of physical and tangible objects produced on a display.

The present invention is not limited to the above embodiments; any modification, equivalent substitution or improvement that achieves the technical effects of the present invention by the same means falls within the spirit and principle of the present invention and is included in its scope of protection. Various modifications and variations of the technical solution and/or the embodiments are possible within the scope of the invention.

Claims

1. An acceleration quantum heuristic solving device based on an FPGA, which is characterized by comprising:

a state change calculation module (100), comprising a first Hamiltonian calculation block (110), a first inverter (120), a second Hamiltonian calculation block (130) and an adder (140), wherein the first Hamiltonian calculation block (110), the first inverter (120) and a first input of the adder (140) are connected in sequence, the second Hamiltonian calculation block (130) is connected to a second input of the adder (140), the input of the first Hamiltonian calculation block (110) is the spin state of the spins in the original state, and the input of the second Hamiltonian calculation block (130) is the spin state of the spins in the state to be confirmed;

a judging module (200), comprising a negative value determiner (210) and a random determiner (220), wherein a first input of the negative value determiner (210) and a first input of the random determiner (220) are each connected to the output of the adder (140);

a second selection module (300), comprising a first OR gate (310), a second inverter (320), a first AND gate (330), a second AND gate (340), a second OR gate (350) and a flip-flop (360), wherein the output of the negative value determiner (210) is connected to a first input of the first OR gate (310), the output of the random determiner (220) is connected to a second input of the first OR gate (310), the output of the first OR gate (310) is connected to the input of the second inverter (320), the output of the second inverter (320) is connected to a second input of the first AND gate (330), a first input of the first AND gate (330) is connected to the input of the first Hamiltonian calculation block (110), the output of the first AND gate (330) is connected to a first input of the second OR gate (350), the output of the first OR gate (310) is connected to a second input of the second AND gate (340), a first input of the second AND gate (340) is connected to the input of the second Hamiltonian calculation block (130), the output of the second AND gate (340) is connected to a second input of the second OR gate (350), the output of the second OR gate (350) is connected to a first input of the flip-flop (360), and the output of the flip-flop (360) is the spin state of the spins in the new state.

2. The FPGA-based accelerated quantum heuristic solver of claim 1 wherein,

the first hamiltonian computation block (110) comprises a first accumulator module and the second hamiltonian computation block (130) comprises a second accumulator module.

3. The FPGA-based accelerated quantum heuristic solver of claim 1 wherein,

the negative value determiner (210) comprises a first input, a second input and an output, wherein the first input of the negative value determiner (210) is connected to the output of the adder (140), the second input of the negative value determiner (210) is connected to a zero value, and the output of the negative value determiner (210) is connected to the first input of the first OR gate (310).

4. The FPGA-based accelerated quantum heuristic solver of claim 1 wherein,

the random determiner (220) comprises a lookup table (221), a random number generator (222) and a positive value determiner (223), wherein the output of the adder (140) is connected to a first input of the lookup table (221), the output of the lookup table (221) is connected to a first input of the positive value determiner (223), the output of the random number generator (222) is connected to a second input of the positive value determiner (223), and the output of the positive value determiner (223) is connected to the second input of the first OR gate (310).

5. The FPGA-based accelerated quantum heuristic solver of claim 4 wherein,

the device further comprises a clock input terminal, to which a second input of the lookup table (221) and a second input of the flip-flop (360) are respectively connected.

6. An FPGA-based accelerated quantum heuristic solving method, based on the FPGA-based accelerated quantum heuristic solving device of any one of claims 1 to 5, comprising the steps of:

7. The FPGA-based accelerated quantum heuristic solving method of claim 6, wherein in step S200, the Hamiltonian is:

wherein,

8. The FPGA-based accelerated quantum heuristic solving method of claim 6, wherein step S500 further comprises:

9. The FPGA-based accelerated quantum heuristic solving method of claim 8, wherein in step S520, the decision condition is:

$r < e^{-\beta \Delta E}$,

10. The FPGA-based accelerated quantum heuristic solution method of claim 6, wherein in step S600, the preset ending condition is: