CN111931939A - Single-amplitude quantum computation simulation method - Google Patents

Info

Publication number
CN111931939A
Authority
CN
China
Prior art keywords
quantum
vertex
edge
tensor
amplitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910394102.9A
Other languages
Chinese (zh)
Other versions
CN111931939B (en)
Inventor
王晶
窦猛汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Origin Quantum Computing Technology Co Ltd
Original Assignee
Origin Quantum Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Origin Quantum Computing Technology Co Ltd
Priority to CN201910394102.9A
Publication of CN111931939A
Application granted
Publication of CN111931939B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N10/00 - Quantum computing, i.e. information processing based on quantum-mechanical phenomena

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a single-amplitude quantum computation simulation method comprising the following steps: for each computing node of a distributed cluster, obtaining a target quantum program; constructing an undirected graph corresponding to the target quantum program, wherein each vertex of the undirected graph represents the quantum state of an operated qubit before or after the operation of a quantum logic gate, and each edge of the undirected graph corresponds to a tensor; obtaining the quantum state corresponding to the target single amplitude to be measured, and calculating a sub-amplitude of the quantum state based on the quantum state and the undirected graph, in cooperation with the GPU corresponding to the computing node, wherein the sub-amplitude is the amplitude corresponding to the undirected graph; and returning the sub-amplitude to the master node of the distributed cluster, so that the master node reduces all the sub-amplitudes to obtain the amplitude of the quantum state as the target single amplitude. With the embodiments of the present invention, quantum computing simulations involving 50 or more qubits can be achieved.

Description

Single-amplitude quantum computation simulation method
Technical Field
The invention belongs to the technical field of quantum computation, and particularly relates to a single-amplitude quantum computation simulation method.
Background
Quantum computers are physical devices that perform high-speed mathematical and logical operations and store and process quantum information in accordance with the laws of quantum mechanics. A device that processes and computes quantum information and runs quantum algorithms is a quantum computer.
Quantum computers can perform a variety of tasks that classical computers cannot accomplish, such as quantum simulation and factoring large numbers. On the road to quantum computing, achieving "quantum supremacy" requires a high-fidelity quantum computer with more than 50 qubits. Until that is realized, quantum computation simulation based on the theory of quantum computing can decouple the software from the hardware of a quantum computer and lay a foundation for the development of quantum programs and quantum applications.
Quantum computation simulation is a simulated computation that follows the laws of quantum mechanics by means of numerical computation and computer science. As a simulation program, it uses the high-speed computing capability of a computer to describe the space-time evolution of quantum states according to the basic laws governing qubits in quantum mechanics.
At present, quantum computation simulation usually adopts full-amplitude simulation, that is, all amplitudes of the final state of the qubits are simulated at one time. However, full-amplitude simulation is computed by unitary transformation, and its memory overhead grows exponentially with the number of qubits. For example, simulating a quantum computation involving 30 qubits requires 16 GByte (gigabytes) of memory; 40 qubits require 16 TByte (terabytes), i.e., 2^10 × 16 GByte; and 50 qubits require 16 PByte (petabytes), i.e., 2^10 × 16 TByte. This is hard to bear for ordinary cloud platforms and even for supercomputing platforms that provide quantum computing simulation services. At present, academia has simulated at most 49 qubits with a full-amplitude simulator, and that result relied on the largest supercomputer in the world, which does not provide cloud services externally and therefore does not facilitate the research and development of quantum programs and quantum applications. In this context, single-amplitude simulation, i.e., a scheme that simulates only one amplitude at a time, has been proposed, and its memory requirement is much smaller. It can therefore be seen that, with the memory resources of current platforms limited, research on and implementation of quantum computation simulation of the amplitude of a single quantum state component are particularly important for the development of quantum computing.
Disclosure of Invention
The invention aims to provide a single-amplitude quantum computation simulation method to solve the defects in the prior art and realize computation simulation involving 50 or more qubits.
The technical scheme adopted by the invention is as follows:
a single amplitude quantum computational simulation method, the method comprising:
obtaining a target quantum program for each computing node of the distributed cluster;
constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the operation of the quantum logic gate, and one edge of the undirected graph corresponds to a tensor;
obtaining the quantum state corresponding to the target single amplitude to be measured, and calculating a sub-amplitude of the quantum state based on the quantum state and the undirected graph, in cooperation with the GPU corresponding to the computing node; wherein the sub-amplitude is the amplitude corresponding to the undirected graph;
and returning the sub-amplitude to the master node of the distributed cluster, so that the master node reduces all the sub-amplitudes to obtain the amplitude of the quantum state as the target single amplitude.
Optionally, the constructing an undirected graph corresponding to the target quantum program includes:
analyzing the target quantum program to obtain a linked list for recording quantum program information;
traversing the linked list, and creating an edge with a tensor order of 1 when the type of the quantum logic gate in the linked list is a first single quantum gate; wherein the edge is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second single quantum gate, creating an edge with the tensor order of 2 and a vertex connected with the edge; the edge is connected with the last vertex of the corresponding vertex chain of the quantum bit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is a non-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, an edge with the tensor order of 2 is created; wherein the edge is connected with the last vertex in the vertex chain respectively corresponding to the two qubits operated by the first dual-quantum gate, and the unitary matrix of the first dual-quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double quantum gate, an edge with the tensor order of 4 and two vertexes connected with the edge are created; the edge is connected with the last vertex in the vertex chain respectively corresponding to the two qubits operated by the second double-quantum gate, and the unitary matrix of the second double-quantum gate is a non-diagonal matrix;
and obtaining an undirected graph corresponding to the target quantum program.
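The four construction rules above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the names `Edge`, `Graph` and `add_gate` are hypothetical, gates are described solely by the qubits they act on and a `diagonal` flag, and the tensor values themselves are omitted so that only the graph topology is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    order: int        # tensor order = number of vertices the edge touches
    vertices: tuple   # ids of the connected vertices


@dataclass
class Graph:
    n_qubits: int
    chains: list = field(default_factory=list)  # one vertex chain per qubit
    edges: list = field(default_factory=list)
    n_vertices: int = 0

    def __post_init__(self):
        # Every qubit starts with a single vertex (its initial state).
        self.chains = [[self._new_vertex()] for _ in range(self.n_qubits)]

    def _new_vertex(self):
        self.n_vertices += 1
        return self.n_vertices - 1

    def add_gate(self, qubits, diagonal):
        # Last vertex of the chain of each operated qubit.
        last = [self.chains[q][-1] for q in qubits]
        if diagonal:
            # Diagonal gate: no new vertex; order 1 (single) or 2 (double).
            self.edges.append(Edge(len(qubits), tuple(last)))
        else:
            # Non-diagonal gate: one new vertex per operated qubit;
            # order 2 (single) or 4 (double).
            new = [self._new_vertex() for _ in qubits]
            for q, v in zip(qubits, new):
                self.chains[q].append(v)
            self.edges.append(Edge(2 * len(qubits), tuple(last + new)))


# Illustrative 2-qubit program (not the one in the figures).
g = Graph(2)
g.add_gate([0], diagonal=False)     # H on q0: non-diagonal single gate
g.add_gate([1], diagonal=False)     # H on q1: non-diagonal single gate
g.add_gate([0, 1], diagonal=True)   # CZ: diagonal double gate
g.add_gate([0], diagonal=True)      # T on q0: diagonal single gate
g.add_gate([0, 1], diagonal=False)  # CNOT: non-diagonal double gate
```

In this sketch a diagonal gate never extends a vertex chain, which is why diagonal gates yield lower-order tensors than non-diagonal gates acting on the same number of qubits.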
Optionally, the calculating, based on the quantum state and the undirected graph and in cooperation with the GPU corresponding to the calculation node, the sub-amplitude of the quantum state includes:
calling the GPU corresponding to the computing node to perform fixed-value order reduction on the tensors of the edges connected to the specific vertices of the undirected graph; wherein the specific vertices are the first and last vertices of the vertex chain corresponding to each qubit;
deleting the specific vertices;
receiving the values of the target vertices allocated by the master node, splitting the current undirected graph based on the values of the target vertices, and, for each sub-undirected graph obtained by the splitting, calling the GPU to perform fixed-value order reduction on the tensors of the edges connected to the target vertices;
for each vertex in the sub-undirected graph, fusing, in cooperation with the GPU, all edges connected to the vertex into a new edge, reducing the order of the tensor of the new edge, and deleting the vertex;
taking the product of the tensor values of all the order-reduced new edges to obtain a first sub-amplitude of the quantum state corresponding to the sub-undirected graph;
and summing the first sub-amplitudes of all the sub-undirected graphs in the quantum state to obtain the sub-amplitude of the quantum state.
Optionally, performing fixed-value order reduction on the tensors of the edges connected to the specific vertices of the undirected graph includes:
for the edge connected to each specific vertex, the GPU corresponding to the computing node sets the number of thread blocks according to the tensor order after reduction and the number of threads in each thread block in the GPU;
calculating a first element number of the tensor after the reduction according to the thread block serial number, the number of threads in each thread block and the thread number, and calculating the two second element numbers of the tensor before the reduction that correspond to the first element number; wherein the bits of an element number correspond one-to-one to the vertex positions connected to the current edge, and the value of each bit is the value of the vertex at the corresponding vertex position;
selecting, from the two second element numbers, the one whose bit at the position corresponding to the specific vertex equals the preset determined value;
and acquiring a second element value corresponding to the determined second element number, and determining the second element value as a first element value corresponding to the first element number.
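A minimal Python sketch of this fixed-value reduction, run serially rather than on a GPU. It assumes that a tensor of order r is stored as a flat list of 2^r elements and that bit p of an element number holds the value of the vertex at position p; the helper names `insert_bit` and `reduce_fixed` are illustrative and do not come from the patent.

```python
def insert_bit(idx, pos, bit):
    """Insert `bit` at bit position `pos` of `idx`, shifting higher bits up."""
    low = idx & ((1 << pos) - 1)
    high = idx >> pos
    return (high << (pos + 1)) | (bit << pos) | low


def reduce_fixed(tensor, pos, value):
    """Drop the vertex at position `pos` by fixing it to `value` (0 or 1)."""
    out = []
    for idx in range(len(tensor) // 2):  # one GPU thread per idx in the text
        # The two element numbers before reduction that map to idx.
        candidates = (insert_bit(idx, pos, 0), insert_bit(idx, pos, 1))
        chosen = candidates[value]  # the one whose bit `pos` equals `value`
        out.append(tensor[chosen])
    return out


# Order-2 tensor; element number = (bit 1, bit 0).
t = [10, 11, 12, 13]
fixed_low = reduce_fixed(t, pos=0, value=1)
fixed_high = reduce_fixed(t, pos=1, value=0)
```

Because each output element copies exactly one input element, the loop iterations are independent, which is what makes the one-thread-per-element mapping in the text possible.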
Optionally, the receiving of the values of the target vertices allocated by the master node, the splitting of the current undirected graph based on those values, and the calling of the GPU to perform fixed-value order reduction on the tensors of the edges connected to the target vertices for each sub-undirected graph obtained by the splitting include:
receiving one or more values of the target vertices, as divided equally by the master node; wherein the target vertices are the m vertices with the most connected edges in the current undirected graph, the m vertices have 2^m value combinations, the number of computing nodes is 2^n, n is a positive integer, and 0 < n ≤ m;
for each allocated vertex value, splitting the undirected graph of the computing node into one or more sub-undirected graphs;
and, for each sub-undirected graph, traversing the edges connected to the target vertices and calling the GPU to perform fixed-value order reduction on the tensors of those edges.
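The equal division of target-vertex values among computing nodes can be illustrated with a short Python sketch. It assumes that the 2^m value combinations are numbered 0 to 2^m - 1 and handed out in consecutive blocks of 2^(m-n) per node; this numbering and the function name `assign_values` are assumptions made for illustration, since the text only states that the values are divided equally.

```python
def assign_values(m, n, node_rank):
    """Return the value combinations of the m target vertices for one node.

    There are 2**m combinations and 2**n nodes, so each node receives
    2**(m - n) combinations. Each combination is a tuple of m bits, where
    entry k is the value assigned to the k-th target vertex.
    """
    assert 0 < n <= m
    per_node = 1 << (m - n)
    start = node_rank * per_node
    out = []
    for code in range(start, start + per_node):
        out.append(tuple((code >> k) & 1 for k in range(m)))
    return out


# 3 target vertices, 4 computing nodes: each node gets 2 of the 8 combinations.
first = assign_values(3, 2, 0)
last = assign_values(3, 2, 3)
everything = [v for rank in range(4) for v in assign_values(3, 2, rank)]
```

Each combination received by a node fixes all m target vertices at once, so it corresponds to one sub-undirected graph to be computed on that node.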
Optionally, the fusing, in cooperation with the GPU, of all edges connected to the vertex into a new edge includes:
determining, among all edges connected to the vertex, a first edge and a second edge to be fused;
calling the GPU to raise the order of the first tensor of the first edge according to the vertices of the second edge that are not connected to the first edge, and updating the first tensor with the order-raised tensor;
deleting the second edge, and connecting the vertices of the second edge that are not connected to the first edge to the first edge to obtain a fused middle edge;
calling the GPU to calculate the tensor elements of the middle edge according to the recorded correspondence between the vertex numbers of the first edge and the second edge;
and returning to the step of determining the first edge and the second edge to be fused until the calculated tensor elements are those of the last edge, and determining the last edge as the fused new edge.
Optionally, the raising of the order of the first tensor of the first edge includes:
the GPU calculates the tensor order after order raising according to the order of the first tensor and the number of orders added;
setting the number of thread blocks according to the tensor order after order raising and the number of threads in each thread block in the GPU;
calculating the first element number of the order-raised tensor according to the thread block serial number, the number of threads in each thread block and the thread number;
calculating the elements of the order-raised tensor according to the first element number, the number of orders added and the elements of the first tensor.
Optionally, the calculating tensor elements of the middle edge includes:
the GPU sets the number of thread blocks according to the updated order of the first tensor and the number of threads in each thread block in the GPU;
calculating the first element number of the tensor of the middle edge according to the thread block serial number, the number of threads in each thread block and the thread number;
determining a corresponding element of each element in the first tensor in a second tensor of the second edge according to the corresponding relation;
traversing each element in the first tensor to update the element by its product with its corresponding element in the second tensor.
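The order raising and element-wise product used to fuse two edges can be sketched serially in Python. The sketch assumes flat tensors of 2^r elements in which bit p of an element number is the value of the p-th vertex in the edge's vertex list; `sub_index` and `fuse` are hypothetical helper names, and the vertex labels are arbitrary.

```python
def sub_index(idx, positions):
    """Build the number whose bit k is bit positions[k] of idx."""
    out = 0
    for k, p in enumerate(positions):
        out |= ((idx >> p) & 1) << k
    return out


def fuse(va, ta, vb, tb):
    """Fuse edge (va, ta) with edge (vb, tb) into one edge."""
    # Vertices of the second edge that the first edge does not touch.
    extra = [v for v in vb if v not in va]
    verts = va + extra
    pos_a = [verts.index(v) for v in va]
    pos_b = [verts.index(v) for v in vb]
    size = 1 << len(verts)
    # Order raising: replicate the first tensor over the added vertices.
    raised = [ta[sub_index(i, pos_a)] for i in range(size)]
    # Element-wise product with the corresponding second-tensor element.
    fused = [raised[i] * tb[sub_index(i, pos_b)] for i in range(size)]
    return verts, fused


# First edge on vertices (x, y), second edge on vertices (y, z).
verts, fused = fuse(["x", "y"], [1, 2, 3, 4], ["y", "z"], [10, 20, 30, 40])
```

Both list comprehensions compute each output element independently from its element number, mirroring the one-thread-per-element scheme described in the text.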
Optionally, reducing the order of the tensor of the new edge includes:
the GPU sets the number of thread blocks according to the tensor order of the new edge after order reduction;
calculating a first element number of the tensor after the reduction according to the thread block serial number, the number of threads in each thread block and the thread number, and calculating the two second element numbers of the tensor before the reduction that correspond to the first element number; wherein the bits of an element number correspond one-to-one to the vertex positions connected to the current edge, and the value of each bit is the value of the vertex at the corresponding vertex position;
and acquiring two second element values corresponding to the two second element numbers one to one, summing the two second element values, and determining the sum as a first element value corresponding to the first element number.
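In contrast to the fixed-value reduction, this reduction sums over both values of the removed vertex. A minimal serial Python sketch, under the same assumed storage convention (flat list of 2^r elements, bit p of the element number is the vertex at position p); `insert_bit` and `reduce_sum` are illustrative names.

```python
def insert_bit(idx, pos, bit):
    """Insert `bit` at bit position `pos` of `idx`, shifting higher bits up."""
    low = idx & ((1 << pos) - 1)
    high = idx >> pos
    return (high << (pos + 1)) | (bit << pos) | low


def reduce_sum(tensor, pos):
    """Remove the vertex at position `pos` by summing over its two values."""
    return [
        tensor[insert_bit(idx, pos, 0)] + tensor[insert_bit(idx, pos, 1)]
        for idx in range(len(tensor) // 2)
    ]


t = [1, 2, 3, 4]             # order-2 tensor
sum_low = reduce_sum(t, 0)   # sum over the vertex stored in bit 0
sum_high = reduce_sum(t, 1)  # sum over the vertex stored in bit 1
scalar = reduce_sum(sum_low, 0)  # an order-1 tensor reduces to one value
```

Once every vertex of a fused edge has been summed out, the edge is left with a single value, which is what the product over all reduced edges then uses.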
Optionally, the calculation formula of the first element number is:
Idx=block_id*num+thread_id
wherein Idx is the first element number, block_id is the thread block serial number, num is the number of threads in each thread block, and thread_id is the thread number.
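The indexing formula can be checked with a serial Python emulation of a one-dimensional GPU launch. The function name `launch`, the rounding-up of the block count and the bounds guard for the last block are assumptions following common GPU practice; the text only specifies the formula for Idx.

```python
import math


def launch(num_elements, num):
    """Emulate a 1-D launch and return (block count, element numbers visited)."""
    n_blocks = math.ceil(num_elements / num)
    covered = []
    for block_id in range(n_blocks):
        for thread_id in range(num):
            idx = block_id * num + thread_id  # Idx = block_id*num + thread_id
            if idx < num_elements:            # guard: last block may be partial
                covered.append(idx)
    return n_blocks, covered


# An order-5 tensor has 2**5 = 32 elements; with 8 threads per block,
# 4 blocks cover every element number exactly once.
blocks, visited = launch(2 ** 5, 8)
```

Since tensor sizes are powers of 2, choosing a power-of-2 thread count per block means the last block is never partial once the tensor has at least `num` elements.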
Compared with the prior art, the method calculates only one target single amplitude of the involved qubits at a time. Specifically, the target quantum program is mapped onto an undirected graph, the undirected graph is split across a plurality of computing nodes using the path integration method, and each computing node cooperates with its subordinate GPU to compute the corresponding sub-undirected graphs. Most of the calculation consists of simple operations on tensor elements. Compared with the prior-art full-amplitude simulation based on unitary matrices, the memory requirement is greatly reduced and the amount of calculation does not grow exponentially with the number of qubits, so quantum computation simulation involving 50 or more qubits can be realized. Because GPUs perform massively parallel computation well, the simulation efficiency is also higher. At present, quantum computation simulation involving up to 196 qubits can be realized by applying the technical solution provided by the embodiments of the present invention.
In addition, in practical application, sometimes only one or more amplitudes in the full amplitude of the qubits are needed, and in this case, if the full amplitude mode in the prior art is adopted, that is, all the amplitudes are simulated at one time, the waste of resources such as a memory and time is undoubtedly caused; by applying the method provided by the embodiment of the invention, one or more times of simulation can be performed in a targeted manner, and one or more single amplitudes required can be simulated, so that resources and time are greatly saved.
Drawings
FIG. 1 is a specific example of splitting the quantum circuit of a quantum program into different paths according to the present invention;
FIG. 2 is a flow chart of a single-amplitude quantum computation simulation method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the undirected graphs constructed for different types of quantum logic gates in the single-amplitude quantum computation simulation method according to the embodiment of the present invention.
Detailed Description
The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
In order to realize computational simulation involving 50 or more qubits, embodiments of the present invention provide a single-amplitude quantum computational simulation method and apparatus.
First, a single-amplitude quantum computation simulation method provided by the embodiment of the present invention is described below.
As will be appreciated by those skilled in the art, each qubit may be in |0> and |1> simultaneously. The quantum state ψ of one qubit can be represented as a|0>+b|1>, where a and b are the amplitudes of |0> and |1> respectively, and both are complex numbers. After measurement, the quantum state collapses to a fixed quantum state: the probability of collapsing to |0> is |a|^2, the probability of collapsing to |1> is |b|^2, and |a|^2+|b|^2=1. The quantum state of n qubits is a superposition of 2^n quantum states. For example, the quantum state ψ of 3 qubits is a superposition of 2^3 (i.e., 8) quantum states, namely |000>, |001>, |010>, |011>, |100>, |101>, |110> and |111>. In this case, the quantum state ψ of the 3 qubits can be expressed as
ψ = c_0|000> + c_1|001> + c_2|010> + c_3|011> + c_4|100> + c_5|101> + c_6|110> + c_7|111>.
Each of the 8 quantum states is referred to as a quantum state component, and each quantum state component has an amplitude; each of the complex numbers c_0 to c_7 may be referred to as a single amplitude. Full-amplitude simulation simulates the amplitudes of all 2^n quantum state components of n qubits at one time; single-amplitude simulation simulates the amplitude of any one of the 2^n quantum state components at a time.
Currently, the full-amplitude mode is mostly adopted in the industry for quantum computation simulation. However, the memory occupied by the full-amplitude mode grows exponentially with the number of simulated qubits. For example, simulating 10 qubits needs only 16 KB of memory, simulating 20 qubits needs 16 MB, simulating 30 qubits needs 16 GB, and simulating 50 qubits needs as much as 16 PB. Even the combined memory of all the computers in the world could not support full-amplitude simulation of 50 or more qubits.
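The memory figures quoted above can be reproduced with a few lines of Python. The sketch assumes that each amplitude occupies 16 bytes (a complex number stored as two 64-bit floats), which is consistent with the quoted figures but is not stated explicitly in the text; the function names are illustrative.

```python
def full_amplitude_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to hold all 2**n amplitudes of an n-qubit state."""
    return bytes_per_amplitude * 2 ** n_qubits


def human(n_bytes):
    """Format a byte count using binary (1024-based) unit steps."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    k = 0
    while n_bytes >= 1024 and k < len(units) - 1:
        n_bytes //= 1024
        k += 1
    return f"{n_bytes} {units[k]}"


for n in (10, 20, 30, 40, 50):
    print(n, "qubits:", human(full_amplitude_bytes(n)))
```

Every additional 10 qubits multiplies the requirement by 2^10, which is why each step in the text moves up exactly one unit.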
In view of this, the embodiment of the present invention provides a single-amplitude quantum computation simulation method applied to the computing nodes of a distributed cluster. All the computing nodes correspond to one master node, and each computing node controls one or more subordinate GPUs. The number of GPUs controlled by one computing node may be a positive integer power of 2, which is convenient for processing quantum information because the number of quantum states of the qubits is a power of 2. The following description takes as an example the case in which one computing node controls one GPU. Preferably, the distributed cluster may be a supercomputer cluster (e.g., the Sunway TaihuLight supercomputer platform).
It should be noted that a quantum program is a sequence of instructions written in a quantum language and capable of running on a quantum computer; it supports the operation of quantum logic gates and ultimately realizes the simulation of quantum computation. In particular, a quantum program is a sequence of instructions that operate quantum logic gates in a time sequence.
A quantum logic gate is a basic quantum circuit that operates on a small number of qubits. It is the basis of quantum circuits, just as conventional logic gates are the basis of conventional digital circuits. Quantum logic gates include single-quantum logic gates, double-quantum logic gates and multi-quantum logic gates.
A quantum program may contain tens or hundreds of quantum logic gate operations, or may contain thousands or millions of quantum logic gate operations. The execution process of the quantum program is a process executed for all the quantum logic gates according to a certain time sequence. The timing is a time sequence in which the quantum logic gates are executed.
It should be noted that quantum logic gates are generally represented by unitary matrices, and a unitary matrix is not only a matrix but also an operation and a transformation. The action of a quantum logic gate on a quantum state is generally calculated by left-multiplying the right vector (ket) corresponding to the quantum state by the unitary matrix.
For example, the right vector corresponding to quantum state |0> is [1, 0]^T (Figure BDA0002057592260000071), and the right vector corresponding to quantum state |1> is [0, 1]^T (Figure BDA0002057592260000072).
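As an illustration of a unitary matrix acting on a right vector, the following Python sketch applies two standard gates to |0>. The choice of the X and Hadamard gates is made only for illustration, and the helper name `apply` is hypothetical.

```python
import math


def apply(gate, ket):
    """Left-multiply a column vector (ket) by a gate matrix."""
    return [
        sum(gate[r][c] * ket[c] for c in range(len(ket)))
        for r in range(len(gate))
    ]


KET0 = [1, 0]  # right vector of |0>
KET1 = [0, 1]  # right vector of |1>

s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]   # non-diagonal: swaps |0> and |1>
H = [[s, s], [s, -s]]  # Hadamard gate, also non-diagonal

flipped = apply(X, KET0)     # X|0> = |1>
superposed = apply(H, KET0)  # H|0> = (|0> + |1>) / sqrt(2)
probabilities = [abs(a) ** 2 for a in superposed]
```

The entries of `superposed` are the two amplitudes of the resulting state, and their squared magnitudes sum to 1 as required for measurement probabilities.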
The quantum logic gates can be further divided into diagonal quantum logic gates and non-diagonal quantum logic gates according to the unitary matrix type. The diagonal quantum logic gate refers to a quantum logic gate with a unitary matrix being a diagonal matrix. As is known, a diagonal matrix is a matrix whose elements outside the main diagonal are all 0, and the elements on the diagonal can be 0 or other values.
For example, the identity matrix
Figure BDA0002057592260000081
is a typical diagonal matrix.
In contrast, a matrix with non-zero elements outside the main diagonal is a non-diagonal matrix, e.g.
Figure BDA0002057592260000082
A quantum logic gate whose unitary matrix is a non-diagonal matrix is a non-diagonal quantum logic gate.
As those skilled in the art will understand, suppose that, apart from the initial quantum state |0...0> of the qubits and the final state, the target quantum program involves M_1 quantum states. Since each qubit can be in |0> and |1>, each of the M_1 quantum states can be split into two quantum state components, |0> and |1>. This yields 2^(M_1) possible transformation paths from the initial quantum state to the final state component |x> = |x_0...x_(n-1)>. The amplitude of each path is calculated, and the path amplitudes are summed to obtain the amplitude of the final state component, i.e., the target quantum state component. Here M_1 is a positive integer.
For example, a target quantum program involves 2 qubits, q_0 and q_1, with initial state s_0 = |00> and target quantum state component |11>. The quantum program includes 2 H gates (Hadamard gates), H_1 and H_2, and 1 CNOT gate (controlled-NOT gate).
FIG. 1a gives a simple illustration of the quantum circuit of this quantum program. It can be seen that, apart from the initial quantum state |00> and the target quantum state component |11>, the quantum circuit involves 4 quantum states: s_0^1, s_0^2, s_1^1 and s_1^2, each of which can be represented as a superposition of |0> and |1>. If s_1^1 is split into two parts, |0> and |1>, the transformation from the initial quantum state |00> to |11> can be split into the two paths shown in FIG. 1b and FIG. 1c. The sub-amplitudes of the transformation from |00> to |11> along the two paths are obtained and summed to give the amplitude corresponding to the target quantum program, which completes the simulation.
It will be appreciated that if all M_1 quantum states in a target quantum program are split into two parts, |0> and |1>, then 2^(M_1) possible transformation paths from the initial quantum state to the final state component are obtained (Figure BDA0002057592260000083). The amplitude of each path is calculated, and the path amplitudes are summed to obtain the target single amplitude of the quantum state.
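The path summation can be illustrated on a deliberately small case in Python: a single qubit acted on by a sequence of gates, where every state between two consecutive gates is split into |0> and |1>. This is a simplified sketch and not the two-qubit circuit of FIG. 1; the function name `path_sum` and the choice of two Hadamard gates are assumptions made for illustration.

```python
import math
from itertools import product

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]  # Hadamard gate


def path_sum(gates, x_in, x_out):
    """Sum the amplitudes of all paths from |x_in> to |x_out>.

    With len(gates) - 1 intermediate states there are
    2 ** (len(gates) - 1) paths.
    """
    total = 0.0
    for mids in product((0, 1), repeat=len(gates) - 1):
        states = (x_in, *mids, x_out)
        amp = 1.0
        for k, gate in enumerate(gates):
            # Matrix element <next state| gate |previous state>.
            amp *= gate[states[k + 1]][states[k]]
        total += amp
    return total


amp_00 = path_sum([H, H], 0, 0)  # the two paths add up
amp_01 = path_sum([H, H], 0, 1)  # the two paths cancel
```

Only one complex accumulator is held at any time, which is the source of the memory saving of single-amplitude simulation, at the cost of enumerating paths.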
In a quantum program containing only single-quantum logic gates and diagonal double-quantum logic gates, given that the initial quantum state of the qubits is |0...0> and the final state component, i.e., the target quantum state component, is |x> = |x_0...x_(n-1)>, the amplitude can be calculated as:
Figure BDA0002057592260000091 (formula (1))
Formula (1) is the basic formula of the quantum mechanical path integration method.
It should be noted that the ψ functions in formula (1) are all complex functions of Boolean variables and represent the contributions of the quantum logic gates to the quantum state. For clarity, formula (1) writes out only one ψ function of each of the three types and omits the others. [Figure BDA0002057592260000092], which takes a value in {0, 1}, is the quantum state component of qubit j after the action of the k-th quantum logic gate. The value of a ψ function depends mainly on two factors: the quantum states, before and after the gate is executed, of the qubits operated on by the quantum logic gate, and the unitary matrix of the quantum logic gate.
Specifically, [Figure BDA0002057592260000093] is a complex function of the Boolean variables [Figure BDA0002057592260000094] and [Figure BDA0002057592260000095]. Its value is determined by the values of the two variables and the unitary matrix of the corresponding diagonal double-quantum logic gate; [Figure BDA0002057592260000096] and [Figure BDA0002057592260000097] are the quantum state components of qubits v_1 and v_2, respectively, before the action of the v-th diagonal double-quantum logic gate.
[Figure BDA0002057592260000098] is a complex function of the Boolean variable [Figure BDA0002057592260000099]. Its value is determined by the value of the variable and the unitary matrix of the corresponding diagonal single-quantum logic gate; [Figure BDA00020575922600000910] is the quantum state component of qubit u before the action of the u'-th diagonal single-quantum logic gate.
[Figure BDA00020575922600000911] is a complex function of the Boolean variables [Figure BDA00020575922600000912] and [Figure BDA00020575922600000913]. Its value is determined by the values of the two variables and the unitary matrix of the corresponding non-diagonal single-quantum logic gate; [Figure BDA00020575922600000914] and [Figure BDA00020575922600000915] are the quantum state components of qubit j before and after the action of the i-th non-diagonal single-quantum logic gate, respectively. It is understood that j, k, v_1, v_2, v_1', v_2', u' and i are all non-negative integers.
The single-amplitude quantum computation simulation method provided by the embodiment of the invention extends formula (1) to non-diagonal double-quantum logic gates and maps it onto an undirected graph. Specifically, with reference to [Figure BDA0002057592260000101], the ψ functions correspond to the edges of the undirected graph, which converts solving formula (1) into processing the undirected graph. It can be understood that the undirected graph can be split according to the value of a vertex [Figure BDA0002057592260000102] of the undirected graph. More specifically, an undirected graph corresponding to the target quantum program is constructed on each of a plurality of computing nodes; the undirected graph is split according to different vertex values to obtain different sub-undirected graphs; the amplitudes of the corresponding paths are obtained by computing the sub-undirected graphs; and finally the amplitudes of all the paths are combined to obtain the amplitude of the target quantum state component.
As shown in fig. 2, a single-amplitude quantum computation simulation method provided in an embodiment of the present invention may include the following steps:
S201, for each computing node of the distributed cluster, obtaining a target quantum program;
From the perspective of the computing nodes, each computing node obtains the same target quantum program sent by the master node. The program is preferably written in the existing QRunes language, but may also be written in another feasible quantum language. A computing node is usually a CPU in a computer; the master node may also be a CPU, and the first computing node may serve as the master node, which receives the quantum program input by the user. Hereinafter the computing node is described as a CPU.
S202, constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the operation of the quantum logic gate, and one edge of the undirected graph corresponds to a tensor;
First, the target quantum program may be parsed to obtain a linked list recording the quantum program information. The linked list is then traversed, the type and unitary matrix form of each quantum logic gate in the linked list are read in sequence, and vertices and edges are added to construct the undirected graph of the quantum program. One specific implementation is as follows:
when the type of the quantum logic gate in the linked list is a first single quantum gate, creating an edge with a tensor order of 1; wherein the edge is connected with the last vertex of the vertex chain corresponding to the qubit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second single quantum gate, creating an edge with a tensor order of 2 and a vertex connected with the edge; wherein the edge is also connected with the last vertex of the vertex chain corresponding to the qubit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is an off-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, creating an edge with a tensor order of 2; wherein the edge is connected with the last vertices of the vertex chains respectively corresponding to the two qubits operated by the first double quantum gate, and the unitary matrix of the first double quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double quantum gate, creating an edge with a tensor order of 4 and two vertices connected with the edge; wherein the edge is also connected with the last vertices of the vertex chains respectively corresponding to the two qubits operated by the second double quantum gate, and the unitary matrix of the second double quantum gate is an off-diagonal matrix;
when the traversal is finished, the addition of vertices and edges is complete and the undirected graph corresponding to the target quantum program is obtained.
It should be noted that, in the process of constructing the undirected graph, whenever a vertex is created it is recorded as the next vertex in the vertex chain of the qubit operated by the current quantum logic gate.
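The construction rules above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Edge`, `UndirectedGraph`, `add_gate` and the `diagonal` flag are hypothetical, and the sketch assumes that every qubit starts with one initial vertex so that the first gate has a "last vertex" to attach to.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    vertices: tuple  # identifiers of the vertices the edge connects
    gate: str        # name of the gate the edge stands for


@dataclass
class UndirectedGraph:
    n_qubits: int
    chains: list = field(default_factory=list)  # one vertex chain per qubit
    edges: list = field(default_factory=list)
    next_id: int = 0

    def __post_init__(self):
        # Assumption: each qubit starts with one vertex for its initial state.
        for _ in range(self.n_qubits):
            self.chains.append([self._new_vertex()])

    def _new_vertex(self):
        vid = self.next_id
        self.next_id += 1
        return vid

    def add_gate(self, name, qubits, diagonal):
        # Last vertex of the chain of every operated qubit.
        last = [self.chains[q][-1] for q in qubits]
        if diagonal:
            # Diagonal gate: edge only, tensor order 1 or 2.
            self.edges.append(Edge(tuple(last), name))
            return
        # Off-diagonal gate: one new vertex per qubit, tensor order 2 or 4.
        new = []
        for q in qubits:
            v = self._new_vertex()
            self.chains[q].append(v)  # recorded as the next vertex of qubit q
            new.append(v)
        self.edges.append(Edge(tuple(last + new), name))


graph = UndirectedGraph(2)
graph.add_gate("H", [0], diagonal=False)
graph.add_gate("CZ", [0, 1], diagonal=True)
graph.add_gate("CNOT", [0, 1], diagonal=False)
print([len(e.vertices) for e in graph.edges])  # tensor order of each edge
```

In this sketch the tensor order of an edge is simply the number of vertices it connects, which matches the four cases listed above.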
In the constructed undirected graph, each qubit corresponds to vertex chain information, which comprises the vertex values from the first vertex to the last vertex, the connecting-edge information of the vertices, and the vertex identifiers. A vertex identifier uniquely determines a vertex; from the identifier, the qubit to which the vertex belongs, the value of the vertex, its connecting-edge information and the like can be determined. Once the value of a vertex is determined, it is 0 or 1. While the vertex value is undetermined, it may be null or any agreed number or character that conforms to the vertex value type, such as -1, which is used to identify the state of the vertex value during implementation. The vertex values may be expressed as tensors, as variables, or as other reasonable data types.
The undirected graph also includes tensor information for the edges, which may include an array of tensors and an identification of vertices connected by the edges corresponding to the tensors.
Wherein, the vertices of the undirected graph correspond to the quantum state components of the operated qubit before or after the operation of a quantum logic gate, each taking a value in {0, 1}, and correspond to the variables in formula (1)
Figure BDA0002057592260000111
The operated qubit is the qubit on which the quantum logic gate acts. The edges of the undirected graph correspond to the quantum logic gates in the target quantum program. Specifically, the edge corresponding to each quantum logic gate corresponds to a tensor whose elements are determined by the unitary matrix of the quantum logic gate and the values of the vertices connected by the edge. It can be understood that the tensor corresponds to the ψ function in formula (1).
A tensor is a quantity defined over several linear spaces simultaneously and is a generalization of the concepts of vector and matrix. Each tensor can be indexed using subscript notation, e.g., tensor T12. The number of subscripts is the order (rank) of the tensor, which represents the number of dimensions of the tensor. For example, a scalar is a 0th-order tensor, a vector is a 1st-order tensor, and a matrix is a 2nd-order tensor. The shape of a tensor is the number of elements in each dimension; the number of tensor elements is determined by its shape.
For example, tensor B1234 has the subscript "1234" and is therefore a 4th-order tensor. If there are 3 elements in the dimension represented by subscript 1, 4 elements in the dimension represented by subscript 2, 5 elements in the dimension represented by subscript 3, and 6 elements in the dimension represented by subscript 4, then the shape of tensor B1234 is [3, 4, 5, 6] and the number of elements is 3 × 4 × 5 × 6 = 360.
In the embodiment of the invention, the order of a tensor is equal to the number of vertices connected by the edge corresponding to the tensor, and the subscripts of the tensor are the numbers of those vertices in the undirected graph. Since each vertex can only take the value 0 or 1, for an n-th order tensor the shape is [2, 2, ..., 2] and the number of elements is 2^n.
The number of each tensor element can be represented as a binary number, and each bit of the binary number can be represented as a value of a corresponding vertex.
For example, edge Em connects 4 vertices, so the corresponding tensor is a 4th-order tensor with 2^4 = 16 elements, whose element numbers can be represented as the binary numbers (0000)_2 to (1111)_2. The 4 vertices connected by edge Em are arranged in a certain order to obtain a vertex sequence. When all four vertices take the value 0, the combined value of the vertex sequence is "0000", which corresponds to the (0000)_2-th element of the tensor; when the first vertex takes the value 1, the second 0, the third 0 and the fourth 1, the combined value of the vertex sequence is "1001", which corresponds to the (1001)_2-th element of the tensor.
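The correspondence between vertex values and element numbers can be sketched as below. The function names `element_number` and `vertex_values` are hypothetical, and the sketch assumes that the first vertex of the sequence maps to the highest bit, as in the "1001" example; other orderings are equally possible.

```python
def element_number(values):
    """Combine vertex values into an element number (first vertex = highest bit)."""
    idx = 0
    for v in values:
        if v not in (0, 1):
            raise ValueError("each vertex value must be 0 or 1")
        idx = (idx << 1) | v
    return idx


def vertex_values(idx, order):
    """Recover the vertex values encoded in an element number of an order-n tensor."""
    return [(idx >> (order - 1 - k)) & 1 for k in range(order)]


print(element_number([1, 0, 0, 1]))  # element number for the sequence "1001"
print(vertex_values(0b1001, 4))      # vertex values stored in that number
```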
In practical applications, when a quantum logic gate is a single-quantum or double-quantum logic gate whose unitary matrix is a diagonal matrix, i.e., when a diagonal single-quantum logic gate or a diagonal double-quantum logic gate acts on a qubit, the gate usually changes only the amplitude, and the quantum state component of the qubit, corresponding in formula (1) to
Figure BDA0002057592260000121
is usually unchanged.
A typical single-quantum logic gate of this type is the Pauli-Z gate, whose unitary matrix is
$$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
When a Pauli-Z gate is applied to a qubit, the basis state |0> of the qubit is left unchanged and |1> is converted to -|1>.
A typical double-quantum logic gate of this type is the CZ gate, whose unitary matrix is
$$CZ = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$
When a CZ gate operation is performed on two qubits Q0 and Q1 (Q0 being the control bit and Q1 the target bit): when the quantum state of Q0 is |0>, the quantum state of Q1 is unchanged; when the quantum state of Q0 is |1>, the |0> state of Q1 remains unchanged and |1> is changed to -|1>.
It can be seen that both the Pauli-Z gate and the CZ gate change only the amplitude of a quantum state component and leave the quantum state component itself unchanged. Therefore, when an undirected graph is constructed for such a quantum logic gate, i.e., a single-quantum or double-quantum logic gate whose unitary matrix is a diagonal matrix, only an edge corresponding to the quantum logic gate is added to the undirected graph.
As shown in FIG. 3a, for a diagonal single-quantum logic gate, only one edge E1 needs to be added when constructing the undirected graph, with one end of the edge connected to vertex V0. Here, V0 is the current last vertex of the corresponding qubit, i.e., the vertex corresponding to the quantum state on which the quantum logic gate directly acts.
As can be appreciated, edge E1 is connected to only one vertex, so its tensor
Figure BDA0002057592260000136
is a 1st-order tensor with 2^1 = 2 elements. This tensor corresponds to the ψu function in formula (1) and represents the contribution of the diagonal single-quantum logic gate to a quantum state. Specifically, assume the unitary matrix of the diagonal single-quantum logic gate is
Figure BDA0002057592260000131
Then
Figure BDA0002057592260000132
Wherein, when vertex V0 takes the value "0", it corresponds, in
Figure BDA0002057592260000133
to the (0)_2-th element, i.e., U00; when V0 takes the value "1", it corresponds to the (1)_2-th element of the tensor, i.e., U11.
For example, when edge E1 is the edge corresponding to the Pauli-Z gate, a diagonal single-quantum logic gate, the tensor obtained from the unitary matrix of the Pauli-Z gate is
Figure BDA0002057592260000134
As shown in FIG. 3b, for a diagonal double-quantum logic gate, only one edge E12 needs to be added when constructing the undirected graph, with one end of the edge connected to vertex V1 and the other end connected to vertex V2. Here, V1 and V2 are the current last vertices of the two qubits operated by the diagonal double-quantum logic gate, i.e., the vertices corresponding to the quantum states on which the gate directly acts.
It will be appreciated that, as shown in FIG. 3b, edge E12 is connected to the two vertices V1 and V2, so its tensor
Figure BDA0002057592260000135
is a 2nd-order tensor with 2^2 = 4 elements. This tensor corresponds to the ψv function in formula (1) and represents the contribution of the diagonal double-quantum logic gate to a quantum state. Specifically, assume the unitary matrix of the diagonal double-quantum logic gate is
Figure BDA0002057592260000141
Then
Figure BDA0002057592260000142
Wherein, when V1V2 takes the value "00", it corresponds, in
Figure BDA0002057592260000143
to the (00)_2-th element; when V1V2 takes the value "01", it corresponds, in
Figure BDA0002057592260000144
to the (01)_2-th element; when V1V2 takes the value "10", it corresponds, in
Figure BDA0002057592260000145
to the (10)_2-th element; and when V1V2 takes the value "11", it corresponds, in
Figure BDA0002057592260000146
to the (11)_2-th element. Of course, the elements of the tensor may also be indexed by the vertex sequence "V2V1"; in this order, it can be understood that
Figure BDA0002057592260000147
It should be noted that one value of the vertex sequence uniquely determines one element of the tensor, but the positional relationship between the value of the vertex sequence and the corresponding element of the tensor is not unique and can be chosen according to actual requirements.
For example, when edge E12 is the edge corresponding to the CZ gate, a diagonal double-quantum logic gate, the tensor obtained from its unitary matrix is
Figure BDA0002057592260000148
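As a hedged illustration of the diagonal case, the sketch below derives the edge tensor of a diagonal gate by reading the diagonal of its unitary matrix. The function name `diagonal_gate_tensor` is hypothetical. The single-qubit mapping (value 0 selects U00, value 1 selects U11) follows the text; for the double-quantum case the sketch assumes that the vertex value "ab" selects the diagonal entry in row and column (ab)_2, since the original equation images are not reproduced here.

```python
def diagonal_gate_tensor(unitary):
    """Return the flat edge tensor [U[0][0], U[1][1], ...] of a diagonal gate."""
    size = len(unitary)
    for r in range(size):
        for c in range(size):
            if r != c and unitary[r][c] != 0:
                raise ValueError("unitary matrix is not diagonal")
    return [unitary[k][k] for k in range(size)]


PAULI_Z = [[1, 0],
           [0, -1]]
CZ = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, -1]]

print(diagonal_gate_tensor(PAULI_Z))  # order-1 tensor, 2 elements
print(diagonal_gate_tensor(CZ))       # order-2 tensor, 4 elements
```

Because the CZ tensor is symmetric in its two vertices, the choice between the sequences "V1V2" and "V2V1" does not change it.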
When the quantum logic gate is a single-quantum logic gate whose unitary matrix is an off-diagonal matrix, i.e., an off-diagonal single-quantum logic gate such as the H gate, the unitary matrix is
$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
Under the operation of the H gate, |0> becomes
$$\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)$$
and |1> becomes
$$\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)$$
so both the amplitude and the quantum state component change. For such a quantum logic gate, when the undirected graph is constructed, a vertex corresponding to the new quantum state after the gate operation and an edge corresponding to the quantum logic gate need to be added to the undirected graph.
As shown in FIG. 3c, for an off-diagonal single-quantum logic gate, a vertex V4 and an edge E34 need to be added when constructing the undirected graph, with one end of the edge connected to vertex V3 and the other end connected to vertex V4. Here, V3 is the current last vertex of the qubit operated by the off-diagonal single-quantum logic gate, i.e., the vertex corresponding to the quantum state on which the gate directly acts; V4 is the new vertex, corresponding to the new quantum state after the operation of the off-diagonal single-quantum logic gate.
As can be appreciated, edge E34 is connected to the two vertices V3 and V4, so its tensor
Figure BDA00020575922600001412
is a 2nd-order tensor with 2^2 = 4 elements. This tensor corresponds to the ψi function in formula (1) and represents the contribution of the off-diagonal single-quantum logic gate to a quantum state. Specifically, assume the unitary matrix of the off-diagonal single-quantum logic gate is
Figure BDA0002057592260000151
Then
Figure BDA0002057592260000152
Wherein, when V3V4 takes the value "00", it corresponds, in
Figure BDA0002057592260000153
to the (00)_2-th element; when V3V4 takes the value "01", it corresponds, in
Figure BDA0002057592260000154
to the (01)_2-th element; when V3V4 takes the value "10", it corresponds, in
Figure BDA0002057592260000155
to the (10)_2-th element; and when V3V4 takes the value "11", it corresponds, in
Figure BDA0002057592260000156
to the (11)_2-th element.
For example, when edge E34 is the edge corresponding to the H gate, an off-diagonal single-quantum logic gate, the tensor obtained from its unitary matrix is
Figure BDA0002057592260000157
When the quantum logic gate is a double-quantum logic gate whose unitary matrix is an off-diagonal matrix, i.e., an off-diagonal double-quantum logic gate such as the CNOT gate, the unitary matrix is
$$\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
Under the operation of the CNOT gate, when the quantum state of the control bit is |0>, the quantum state of the controlled bit is unchanged, i.e., when the quantum states of the control bit and the controlled bit are |00> and |01>, they remain |00> and |01> respectively after the CNOT gate operation. When the quantum state of the control bit is |1>, the quantum state of the controlled bit is negated, i.e., |0> becomes |1> and |1> becomes |0>, so that when the quantum states of the control bit and the controlled bit are |10> and |11>, they become |11> and |10> respectively after the CNOT gate operation. Since the two qubits are usually in a superposition of |00>, |01>, |10> and |11>, it can be seen that after the CNOT gate operation both the amplitudes and the quantum state components of the two qubits involved change. For such a quantum logic gate, when the undirected graph is constructed, a vertex corresponding to the new quantum state after the operation of the double-quantum logic gate is added for each qubit, i.e., two new vertices in total, and the edge corresponding to the double-quantum logic gate is added to the undirected graph.
As shown in FIG. 3d, for an off-diagonal double-quantum logic gate, two new vertices V7 and V8 and one edge E5678 need to be added when constructing the undirected graph. For the sake of clarity two edges are drawn, but it should be noted that the two edges in FIG. 3d are actually the same edge E5678. One end of the newly added edge is connected to vertices V5 and V7, and the other end is connected to vertices V6 and V8. Here, V5 and V6 are the current last vertices of the two qubits operated by the off-diagonal double-quantum logic gate, and V7 and V8 are the new vertices, corresponding to the new quantum states of the two qubits after the operation of the gate. In particular, when the off-diagonal double-quantum logic gate is a controlled logic gate, V5 and V7 correspond to the control bit, and V6 and V8 correspond to the controlled bit.
It will be appreciated that, as shown in FIG. 3d, edge E5678 is connected to the four vertices V5, V6, V7 and V8, so its tensor
Figure BDA0002057592260000161
is a 4th-order tensor with 2^4 = 16 elements. Let the values (0000 to 1111) of the vertex sequence "V5V6V7V8" correspond, in
Figure BDA0002057592260000162
to the (0000)_2-th to (1111)_2-th elements. Specifically, assume the unitary matrix of the off-diagonal double-quantum logic gate is
Figure BDA0002057592260000163
where the Uij with i ≠ j are not all 0. The correspondence between the values of the vertex sequence "V5V6V7V8" and the elements of U4 is then shown in Table 1:
Table 1: Correspondence between the values of V5V6V7V8 and the elements of U4
Figure BDA0002057592260000164
It can thus be seen that
Figure BDA0002057592260000165
For example, when edge E5678 is the edge corresponding to the CNOT gate, an off-diagonal double-quantum logic gate, its unitary matrix is
Figure BDA0002057592260000166
Thus, its tensor
Figure BDA0002057592260000167
It will be understood by those skilled in the art that any multi-qubit gate can be constructed from single-quantum logic gates plus a double-quantum logic gate, which in most cases is chosen to be the CNOT gate. In a sense, the CNOT gate and the single-quantum logic gates are the prototypes of all other gates. Therefore, the method provided by the application is also suitable for quantum programs containing multi-qubit gates: in practical applications, the multi-qubit gates in the quantum program can first be converted into combinations of single-quantum and double-quantum logic gates, after which the single-amplitude quantum computation simulation method provided by the invention is applied.
S203: obtaining the quantum state corresponding to the target single amplitude to be measured, and calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph, in cooperation with the GPU corresponding to the computing node; wherein the sub-amplitude is the amplitude corresponding to the undirected graph;
In practical applications, when the simulation of quantum computation involves many qubits, it would be very inconvenient to represent each quantum state directly with the Dirac notation | > combined with a binary representation. Therefore, the decimal number corresponding to the binary representation is usually used instead; for example, |000> is the 0 state and |0100> is the 4 state. It will be appreciated that if the target quantum state component is given as a decimal number, it needs to be converted by the master node into a binary string and then sent to each computing node. Each bit in the binary string corresponds to the value of one qubit, and the bits from low to high correspond to the qubits from low to high. It should be noted that, for a quantum computer, low and high positions are arranged as in a classical computer, from low to high going from right to left. For example, assume the target quantum state component is the 5 state; converted to binary form it is "101", corresponding to 3 qubits (q0, q1, q2 in order from low to high), so the bits of "101" correspond to "q2 q1 q0" respectively.
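The conversion of a decimal target state into per-qubit values can be sketched as follows. The function name `target_state_bits` and the "q0", "q1", ... dictionary keys are hypothetical and serve only to illustrate the right-to-left bit ordering described above.

```python
def target_state_bits(state, n_qubits):
    """Convert a decimal state label into a binary string and per-qubit values."""
    if not 0 <= state < 2 ** n_qubits:
        raise ValueError("state does not fit in the given number of qubits")
    bits = format(state, "0{}b".format(n_qubits))
    # The rightmost character is the lowest qubit q0, as on a classical computer.
    per_qubit = {"q{}".format(k): int(bits[n_qubits - 1 - k])
                 for k in range(n_qubits)}
    return bits, per_qubit


print(target_state_bits(5, 3))  # the 5 state on three qubits
print(target_state_bits(4, 4))  # the 4 state on four qubits, i.e. |0100>
```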
Specifically, calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and by cooperating with the GPU corresponding to the calculation node may include:
calling the GPU corresponding to the computing node to perform determined-value order reduction on the tensors of the edges connected to the specific vertices of the undirected graph, wherein the specific vertices are the first and last vertices of the vertex chain corresponding to each qubit; deleting the specific vertices; receiving the values of the target vertices distributed by the master node, splitting the current undirected graph based on the values of the target vertices, and, for each sub-undirected graph obtained by splitting, calling the GPU to perform determined-value order reduction on the tensors of the edges connected to the target vertices; for each vertex in the sub-undirected graph, using the GPU to fuse all edges connected to the vertex into a new edge, reducing the order of the tensor of the new edge, and deleting the vertex; multiplying the tensor values of all the reduced new edges to obtain a first sub-amplitude of the quantum state corresponding to the sub-undirected graph; and summing the first sub-amplitudes of all sub-undirected graphs in the computing node to obtain the sub-amplitude of the quantum state.
In the end, all vertices in each sub-undirected graph have been deleted, leaving only edges whose tensor order is 0; a 0-order tensor is a scalar. The tensor values of these edges are multiplied to obtain the first sub-amplitude of the quantum state for that sub-undirected graph in the computing node.
For example, for a sub-undirected graph, the edges connecting all of its vertices are S12, S16, S25, S3 and S4.
For vertex 1, S12 and S16 are fused into a new edge S126, its tensor A126 is reduced to A26, the corresponding edge becomes S26, and vertex 1 is deleted;
for vertex 2, the currently connected edges S25 and S26 are fused into a new edge S256, its tensor A256 is reduced to A56, the corresponding edge becomes S56, and vertex 2 is deleted;
for vertex 3, only edge S3 is connected; its tensor A3 is reduced to A = x3 (a scalar), the corresponding edge becomes an edge s3 with tensor order 0, and vertex 3 is deleted;
for vertex 4, only edge S4 is connected; its tensor A4 is reduced to A′ = x4 (a scalar), the corresponding edge becomes an edge s4 with tensor order 0, and vertex 4 is deleted;
for vertex 5, only edge S56 is currently connected, so fusion leaves the tensor unchanged; its tensor A56 is reduced to A6, the corresponding edge becomes S6, and vertex 5 is deleted;
for vertex 6, only edge S6 is currently connected; its tensor A6 is reduced to A″ = x6 (a scalar), the corresponding edge becomes an edge s6 with tensor order 0, and vertex 6 is deleted.
The first sub-amplitude corresponding to the sub-undirected graph is then calculated as x3*x4*x6.
For another example, for a sub-undirected graph, the edges connecting all of its vertices are S123, S124, S15 and S46.
For vertex 1, S123, S124 and S15 are fused into a new edge S12345, its tensor A12345 is reduced to A2345, the corresponding edge becomes S2345, and vertex 1 is deleted;
for vertex 2, only edge S2345 is currently connected; its tensor A2345 is reduced to A345, the corresponding edge becomes S345, and vertex 2 is deleted;
for vertex 3, only edge S345 is connected; its tensor A345 is reduced to A45, the corresponding edge becomes S45, and vertex 3 is deleted;
for vertex 4, the currently connected edges S45 and S46 are fused into a new edge S456, its tensor A456 is reduced to A56, the corresponding edge becomes S56, and vertex 4 is deleted;
for vertex 5, only edge S56 is currently connected; its tensor A56 is reduced to A6, the corresponding edge becomes S6, and vertex 5 is deleted;
for vertex 6, only edge S6 is currently connected; its tensor A6 is reduced to B = y6 (a scalar), the corresponding edge becomes an edge s6 with tensor order 0, and vertex 6 is deleted.
The first sub-amplitude corresponding to the sub-undirected graph is then calculated as y6.
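The bookkeeping in the two examples above can be replayed with a small symbolic sketch. It tracks only which vertices each edge connects, not the tensor values, so it makes no assumption about how the numerical reduction is carried out. The function name `eliminate_vertices` is hypothetical.

```python
def eliminate_vertices(edges, order):
    """Fuse and reduce edges vertex by vertex, tracking only vertex sets.

    edges: iterable of tuples of vertex numbers, e.g. (1, 2) for S12.
    order: sequence in which the vertices are deleted.
    Returns (history, remaining), where each history item is
    (vertex, fused vertex list, reduced vertex list).
    """
    current = [frozenset(e) for e in edges]
    history = []
    for v in order:
        touching = [e for e in current if v in e]
        if not touching:
            continue
        rest = [e for e in current if v not in e]
        fused = frozenset().union(*touching)  # fuse all edges connected to v
        reduced = fused - {v}                 # reducing the order removes v
        current = rest + [reduced]
        history.append((v, sorted(fused), sorted(reduced)))
    return history, current


first = [(1, 2), (1, 6), (2, 5), (3,), (4,)]
history, remaining = eliminate_vertices(first, range(1, 7))
for step in history:
    print(step)
print(len(remaining), "order-0 edges remain")  # one scalar factor per edge
```

For the first example three order-0 edges remain, matching the three factors x3, x4 and x6; for the second example a single order-0 edge remains, matching y6.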
Specifically, in order to reduce the amount of computation on the subsequent undirected graph, performing determined-value order reduction on the tensors of the edges connected to the specific vertices of the undirected graph may be:
calling the GPU corresponding to the computing node and, for the edge connected to each specific vertex, setting the number of thread blocks according to the tensor order after reduction and the number of threads in each thread block of the GPU; calculating a first element number of the reduced tensor from the thread block sequence number, the number of threads in each thread block and the thread number, and calculating the two second element numbers of the tensor before reduction that correspond to the first element number, wherein the bit positions of an element number correspond one-to-one to the vertex positions connected by the current edge, and the value of each bit of the element number is the value of the vertex at the corresponding vertex position; determining, from the two second element numbers, the second element number whose bit at the position corresponding to the specific vertex has the preset determined value; and obtaining the second element value corresponding to the determined second element number and taking it as the first element value corresponding to the first element number.
The first element number is calculated as follows (the same applies below):
idx = block_id * num + thread_id (2)
where idx is the first element number, block_id is the thread block sequence number, num is the number of threads in each thread block, and thread_id is the thread number.
From the perspective of the GPU, a specific implementation of the determined-value order reduction is as follows. First, the elements of a tensor can be complex numbers, and since the GPU, unlike the CPU, has no representation for a complex number x + yi, a new real-part tensor space and a new imaginary-part tensor space need to be allocated, where the real-part tensor space stores x and the imaginary-part tensor space stores y, so as to represent the tensor.
The reduced order n sent by the computing node (CPU) is obtained, and it is judged whether 2^n is less than num. If so, the number of GPU thread blocks is set to 1; otherwise, the number of GPU thread blocks is set to 2^n/num, where num is a non-negative integer power of 2. As is customary in computing, the thread block sequence number block_id and the thread number thread_id are numbered from 0.
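The thread-block sizing rule and formula (2) can be emulated on the CPU with the short sketch below. It is only an illustration of the index arithmetic, not GPU code; the function names `launch_config` and `element_numbers` are hypothetical, and the default of 1024 threads per block is taken from the GTX1080ti example in the text.

```python
def launch_config(n, num=1024):
    """Return (number of thread blocks, threads used per block) for 2**n elements."""
    total = 1 << n
    blocks = 1 if total < num else total // num
    threads = min(total, num)
    return blocks, threads


def element_numbers(n, num=1024):
    """Enumerate idx = block_id * num + thread_id for every active thread."""
    blocks, threads = launch_config(n, num)
    return [block_id * num + thread_id
            for block_id in range(blocks)
            for thread_id in range(threads)]


print(launch_config(4))    # a reduced tensor of order 4 fits in one block
print(element_numbers(4))  # one element number per active thread
print(launch_config(12))   # 2**12 elements need several blocks
```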
Taking a GTX1080ti GPU as an example, num of the GTX1080ti is 1024 and thread_id is 0, 1, 2, ..., 1023. For example, for edge E13254 connected to the vertex numbered 2 (vertex 2 for short), determined-value order reduction is performed on its 5th-order tensor A13254, i.e., the order of A13254 in which vertex 2 is located is reduced at a determined value. The dimensions are numbered from right to left, e.g., vertex 1 is in dimension 5 and vertex 4 is in dimension 1.
Assuming that the determined value is 0 (it could equally be 1, as required), vertex 2 takes the value 0. One order-reduction operation usually reduces the order by one, so n is 4 and 2^n = 16 is less than 1024; the number of thread blocks is 1 and block_id is 0. The reduced 4th-order tensor A1354 has only 16 elements. Starting with thread 0, each thread calculates one element number, so only threads 0 to 15 are called, and the first element numbers are calculated as idx = 0, 1, 2, ..., 15.
For the first element number 10 (binary 1010), the two corresponding second element numbers before the reduction are calculated according to the following principle:
First, 2^(3-1) - 1 = 3 is computed (binary 11; when the m-th order is reduced, the value 2^(m-1) - 1 is computed and used to split the element number) and bitwise ANDed with 10, yielding 2 (binary 10). Then 3 is bitwise negated (binary 1111...1100 after negation; the number of binary digits depends on the data type, e.g., 64 bits) and bitwise ANDed with 10, yielding 8 (binary 1000). This splits the binary form of 10 into its upper two bits and its lower two bits. The two second element numbers before the reduction, in which the vertex corresponding to subscript 2 takes the determined value 0 and 1 respectively, are then obtained as follows: 8 (1000) is shifted left by one bit (10000) and ORed with 2 (10), yielding 18 (10010); since the reduced order is order 3, 18 (10010) is ORed with 4 (i.e., 1 << (3-1); when the m-th order is reduced, the value 1 << (m-1) is used, with 1 ≤ m ≤ n+1), yielding 22 (10110).
Since the set determined value is 0, i.e., the value of the 3rd bit of the element number, which corresponds to the specific vertex in the 3rd position, is 0 (the bits of the binary element number correspond one-to-one to the vertex positions, and the value of each bit equals the value of the vertex at that position), the second element number obtained from the first element number 10 is determined to be 18. The element value p of element number 18 of A13254 is taken as the first element value of element number 10 of the reduced tensor A1354. Similarly, the first element values corresponding to the remaining first element numbers of A1354 can be calculated, finally giving the reduced tensor A1354 = {p0, p1, p2, ..., p, ..., p14, p15}.
Then, the video memory occupied by the original pre-reduction tensor A13254 is released.
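The bit manipulation described above can be checked with the following CPU-side sketch. It is a hedged illustration rather than the GPU kernel itself: the function names `second_element_numbers` and `reduce_order` are hypothetical, the loop over `idx` stands in for one GPU thread per element number, and the tensor is stored as separate real-part and imaginary-part lists as described in the text.

```python
def second_element_numbers(idx, m):
    """Element numbers before reducing the m-th order (counted from the right, from 1).

    Returns (number with bit m equal to 0, number with bit m equal to 1).
    """
    low_mask = (1 << (m - 1)) - 1   # 2**(m-1) - 1, splits the element number
    low = idx & low_mask            # bits below the reduced position
    high = idx & ~low_mask          # bits at and above the reduced position
    zero = (high << 1) | low        # reopen the position with value 0
    one = zero | (1 << (m - 1))     # same position with value 1
    return zero, one


def reduce_order(real, imag, order, m, value):
    """Fix the vertex in dimension m to `value` and return the reduced tensor."""
    size = 1 << (order - 1)
    new_real, new_imag = [0.0] * size, [0.0] * size
    for idx in range(size):         # on the GPU: one thread per idx
        zero, one = second_element_numbers(idx, m)
        src = one if value else zero
        new_real[idx] = real[src]
        new_imag[idx] = imag[src]
    return new_real, new_imag


print(second_element_numbers(10, 3))  # the worked example for A13254
```

Applied to the worked example, first element number 10 with m = 3 yields the pair 18 and 22, and the determined value 0 selects 18.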
Specifically, in order to reduce the computational complexity, one or more value combinations of the target vertices, evenly allocated by the master node, may be received; wherein the target vertices are the first m vertices with the largest number of connected edges in the current undirected graph, the m vertices have 2^m value combinations, the number of computing nodes is 2^n, n is a positive integer, and n is greater than 0 and less than or equal to m. For each allocated value combination, the undirected graph of the computing node is split into one or more sub-undirected graphs; and for each sub-undirected graph, the edges connected to the target vertices are traversed, and the GPU is called to perform determined-value order reduction on the tensors of the edges connected to the target vertices.
The number of computing nodes is a preset value, i.e., n is preset, and is related to the quantum program written and the available computing resources configured for it; the value of m is preset as needed. If n is 1 and m is 1, the value of the 1 vertex with the largest number of connected edges is originally undetermined and has 2 possibilities: 0 or 1. These 2 cases are allocated to 2 computing nodes for separate computation: in one computing node the vertex value in the undirected graph is fixed to 0, giving one sub-undirected graph, and in the other computing node the vertex value is fixed to 1, giving another sub-undirected graph. The undirected graph is thereby split.
For another example, if n is 1 and m is 2, there are 4 value combinations: 00, 01, 10 and 11. The 2 computing nodes then evenly share the 4 value combinations of the first 2 vertices with the most connected edges, generally in sequence, with one computing node receiving 00 and 01 and the other receiving 10 and 11. Equivalently, the undirected graph in one computing node is split into 2 sub-undirected graphs: in one of them the 2 vertices take the values 0 and 0, and in the other they take the values 0 and 1. Similarly, in one sub-undirected graph of the other computing node the two vertices take the values 1 and 0, and in its other sub-undirected graph they take the values 1 and 1.
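The even, in-sequence allocation of value combinations to computing nodes can be sketched as follows. The function name `assign_vertex_values` is hypothetical, and the sketch assumes the sequential split described above (other allocations would serve equally well).

```python
def assign_vertex_values(m, n):
    """Split the 2**m value combinations of m target vertices over 2**n nodes."""
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= m")
    combos = [format(k, "0{}b".format(m)) for k in range(1 << m)]
    per_node = len(combos) >> n  # 2**(m - n) combinations for each node
    return [combos[i * per_node:(i + 1) * per_node] for i in range(1 << n)]


print(assign_vertex_values(1, 1))  # one vertex, two nodes
print(assign_vertex_values(2, 1))  # two vertices, two nodes
```

Each combination received by a node corresponds to one sub-undirected graph that the node has to process.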
In terms of all the computing nodes, originally, each computing node comprises a same undirected graph, so that splitting is performed, the undirected graph of each computing node is split into one or more different sub-undirected graphs, each sub-undirected graph can be regarded as a Path, a first sub-amplitude corresponding to each Path inside one computing node is calculated, and the internal first sub-amplitudes are summed to obtain a sub-amplitude of a quantum state corresponding to the current node, so that the idea of a Path integration method (Feymann Path integration) is embodied, and the method is not based on a unitary matrix transformation method, because the latter causes memory occupation to increase exponentially.
Taking n = 1 and m = 2 as an example, suppose vertex 3 and vertex 5 have the most connected edges in the current undirected graph, and that the edges connected to vertex 3 or 5 are E_{3241}, E_{361}, E_{32}, E_{354}, E_{125}, E_{45} and E_{56}. For the computing node assigned the value 00, these edges are traversed and the GPU of that node is called to perform determined-value order reduction on the tensor of each edge in the sub-undirected graph. The principle is the same as the determined-value order reduction of the tensors of the edges connected to the specific vertices of the undirected graph, as illustrated below:
For edge E_{3241}, which corresponds to the 4th-order tensor T_{3241}, and for the case where vertices 3 and 5 take the value 00: since the edge is not connected to vertex 5, only the 4th order of T_{3241}, where vertex 3 is located, needs determined-value order reduction. As before, the GPU first allocates a new real-part tensor and a new imaginary-part tensor, obtains the reduced order n sent by the computing node (CPU) it belongs to, and checks whether 2^n is less than num, the number of threads per thread block. If so, the number of GPU thread blocks is set to 1; otherwise it is set to 2^n/num.
Taking the GTX 1080 Ti as an example, num is 1024 and thread_id = 0, 1, 2, ..., 1023. Vertex 3 takes the value 0, i.e., the determined value is 0. One order-reduction operation usually lowers the order by one, so n = 3 and 2^n is less than 1024; there is 1 thread block, with block_id = 0. The reduced 3rd-order tensor T_{241} has only 8 elements. Starting from thread 0, each thread computes one element number, so only threads 0 to 7 are invoked, and the first element numbers computed by formula (2) are idx = 0, 1, 2, ..., 7.
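As a rough illustration of the thread-block sizing rule and of formula (2), idx = block_id * num + thread_id, the following Python sketch enumerates the element numbers that the GPU threads would produce. It is a CPU-side emulation under stated assumptions, not the GPU kernel itself; the function names thread_block_count and element_numbers are hypothetical.

```python
def thread_block_count(order: int, num: int = 1024) -> int:
    """Number of thread blocks for a tensor of the given order:
    1 if 2**order < num, otherwise 2**order / num."""
    size = 2 ** order
    return 1 if size < num else size // num


def element_numbers(order: int, num: int = 1024) -> list[int]:
    """Emulate formula (2) over all blocks and threads, keeping only
    the threads that map to an existing tensor element."""
    size = 2 ** order
    numbers = []
    for block_id in range(thread_block_count(order, num)):
        for thread_id in range(num):
            idx = block_id * num + thread_id  # formula (2)
            if idx < size:                    # surplus threads stay idle
                numbers.append(idx)
    return numbers


print(thread_block_count(3), element_numbers(3))
```

For the reduced 3rd-order tensor this yields one thread block and the element numbers 0 to 7, in line with the example above.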
For the first element number 5 (binary 101), the two corresponding second element numbers before the reduction are calculated as follows:
For the 4th order, 2^(4-1) - 1 = 7 (binary 111), and a bitwise AND of 7 and 5 yields 5 (binary 101). Bitwise negation of 7 gives binary 1111...1000, and its bitwise AND with 5 yields 0; this splits 5 into a low part (5) and a high part (0). The two second element numbers before the reduction, for the determined values 0 and 1 of vertex 3 (subscript 3), are then obtained as follows. The high part 0 is shifted left by one bit and ORed with 5, giving 5 (0101). Then 5 (0101) is ORed with 8 (1000), giving 13 (1101); here 8 equals 1 << (4-1), i.e., for a reduction of the m-th order the operand is 1 << (m-1), with 1 ≤ m ≤ n+1.
Each binary digit of an element number represents the value of the corresponding vertex, so 0101 means that vertices 3, 2, 4 and 1 of edge E_{3241} (tensor T_{3241}) take the values 0, 1, 0 and 1 in turn. Since the determined value of vertex 3 is 0, the second element number 5 is the one selected for the first element number 5: the element of T_{3241} with element number 5 is taken as the first element value w_5, i.e., the element of the reduced tensor T_{241} with element number 5. The first element values for the remaining first element numbers are calculated in the same way, finally giving the reduced tensor T_{241} = {w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7}.
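The bit manipulation above can be sketched in Python as follows. This is a minimal sketch under assumptions: the function names split_element_number and fix_vertex are hypothetical, the order is counted from the lowest bit starting at 1, and plain numbers stand in for the separate real-part and imaginary-part tensors used on the GPU.

```python
def split_element_number(idx: int, order_pos: int) -> tuple[int, int]:
    """Return the two pre-reduction element numbers for the post-reduction
    number idx, with the removed vertex bit set to 0 and to 1."""
    low_mask = (1 << (order_pos - 1)) - 1      # e.g. 7 for the 4th order
    low = idx & low_mask                       # bits below the removed bit
    high = idx & ~low_mask                     # bits at or above it
    with_zero = (high << 1) | low              # removed vertex takes value 0
    with_one = with_zero | (1 << (order_pos - 1))  # removed vertex takes 1
    return with_zero, with_one


def fix_vertex(tensor: list, order_pos: int, value: int) -> list:
    """Determined-value order reduction: keep only the elements whose bit
    at order_pos equals the determined value."""
    reduced = []
    for idx in range(len(tensor) // 2):
        with_zero, with_one = split_element_number(idx, order_pos)
        reduced.append(tensor[with_one] if value else tensor[with_zero])
    return reduced


print(split_element_number(5, 4))
```

For the first element number 5 and the 4th order, the sketch reproduces the two second element numbers 5 and 13 derived above.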
The video memory occupied by the original tensor T_{3241} is then released.
In addition, for edge E_{354}, with vertices 3 and 5 taking the value 00, since each operation reduces the order by only one, the 3rd order of tensor T_{354}, where vertex 3 is located, can first be reduced to give T_{54}, and then the 2nd order of T_{54}, where vertex 5 is located, can be reduced to give T_4.
The order reduction calculation of the edges and the value taking conditions in the other sub-undirected graphs is the same as the above, and is not described again.
Specifically, fusing all the connecting edges of the vertex into a new edge in cooperation with the GPU includes:
determining a first edge and a second edge to be fused among all the connecting edges of the vertex; calling the GPU to raise the order of the first tensor of the first edge according to the vertices of the second edge that are not connected to the first edge, and updating the first tensor with the raised tensor; deleting the second edge, and connecting the vertices of the second edge that are not connected to the first edge to the first edge, to obtain a fused intermediate edge; calling the GPU to calculate the tensor elements of the intermediate edge according to the recorded correspondence between the vertex positions of the first edge and the second edge; and returning to the step of determining the first edge and the second edge to be fused, until the calculated tensor elements are those of the last remaining edge, which is determined to be the new edge.
In practice, fusing any two edges generally requires first raising the order of the tensor of one edge, so the edge with the highest order among all the connecting edges of the vertex is generally selected as the first edge.
It is understood that the second edge is an edge, other than the first edge, that has not yet been fused.
Specifically, an edge can be selected directly from the remaining edges as the second edge. Alternatively, the connecting edges of the vertex can be sorted by order, either ascending or descending without limitation; the edge with the highest order is then determined as the first edge, and the edge with the highest order among the remaining unfused edges is determined as the second edge.
For example, vertex V_n has 4 connected edges, E_{n1}, E_{n2}, E_{n3} and E_{n4}, with orders 3, 2, 4 and 2 respectively. Sorting the edges by order from largest to smallest gives E_{n3}, E_{n1}, E_{n2}, E_{n4} (or E_{n3}, E_{n1}, E_{n4}, E_{n2}). E_{n3} has the highest order and is determined as the first edge; of the three remaining unfused edges, E_{n1} has the highest order and is determined as the second edge.
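The sort-based selection in this example might be sketched as below. The function name pick_edges_to_fuse and the dictionary layout (edge name mapped to tensor order) are assumptions made for illustration only.

```python
def pick_edges_to_fuse(orders: dict[str, int]) -> tuple[str, str]:
    """Sort the unfused edges by tensor order, largest first, and return
    (first edge, second edge). Ties keep their original relative order."""
    ranked = sorted(orders, key=lambda edge: orders[edge], reverse=True)
    return ranked[0], ranked[1]


# Orders of the four edges connected to vertex V_n in the example.
orders = {"En1": 3, "En2": 2, "En3": 4, "En4": 2}
print(pick_edges_to_fuse(orders))
```

Because Python's sort is stable, edges of equal order (here En2 and En4) stay in their input order, which corresponds to the first of the two orderings mentioned in the example.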
The vertices connected to the first edge and to the second edge are compared to determine the vertices connected to the second edge but not to the first edge; the order of the first tensor is then raised according to the determined vertices, and the first tensor is updated with the raised tensor.
For example, suppose the vertex is numbered 2. The first edge is E_{12}, connecting vertices 1 and 2, with first tensor A_{12} = {1, 2, 3, 4}; the second edge is E_{23}, connecting vertices 2 and 3, with second tensor B_{23} = {5, 6, 7, 8}. Comparing the vertices of the two edges shows that vertex 3 is connected to the second edge but not to the first edge. Accordingly, A_{12} is raised to A_{123}; as explained below, A_{123} = {1, 1, 2, 2, 3, 3, 4, 4}, and the first tensor of the first edge is updated to A_{123}.
The order raising of the first tensor of the first edge is performed by the GPU, and the principle may be as follows:
to raise the order of the first tensor of the first edge, the GPU calculates the raised tensor order t = r + s from the order r of the first tensor and the number of orders s to be added, where r and s are both positive integers; it sets the number of thread blocks according to t and the number num of threads per thread block in the GPU, setting 1 thread block if 2^t is less than num and 2^t/num thread blocks otherwise; it calculates the first element number idx of the raised tensor from the thread block serial number block_id, the number num of threads per thread block and the thread number thread_id, using the same formula as formula (2); and it calculates the elements of the raised tensor from the first element number idx, the number of added orders s and the elements of the first tensor.
Continuing with the GTX 1080 Ti example, the GPU receives the order-raising instruction from the computing node. The first tensor before raising is A_{12} = {1, 2, 3, 4}, with element numbers 0, 1, 2, 3 (binary 00, 01, 10, 11); it is raised to A_{123}, so r = 2, s = 1 and t = 3. Since 2^t is less than num (1024), 1 thread block is configured, with block_id = 0. As before, thread_id = 0, 1, ..., 1023; because A_{123} has only 8 elements, only threads 0 to 7 need to be called, and each thread obtains from formula (2) an element number of A_{123}, idx = 0, 1, 2, ..., 7 in turn. The element with first element number idx is assigned as dst[idx] = src[idx/2^s]: the element value dst[idx] of A_{123} with first element number idx equals the element value of A_{12} whose element number is the integer part of idx/2^s, i.e., what remains after discarding the fractional part.
From the idx values of A_{123}, idx/2^s = idx/2 is 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5 in turn, which gives:
dst[0]=src[0]=1,dst[1]=src[0.5]=src[0]=1;
dst[2]=src[1]=2,dst[3]=src[1.5]=src[1]=2;
dst[4]=src[2]=3,dst[5]=src[2.5]=src[2]=3;
dst[6]=src[3]=4,dst[7]=src[3.5]=src[3]=4。
This gives A_{123} = {1, 1, 2, 2, 3, 3, 4, 4}. Similarly, raising A_{12} by 2 orders to A_{1234} gives A_{1234} = {1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4}.
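The assignment dst[idx] = src[idx/2^s] with integer division can be illustrated with a short Python sketch. The function name raise_order is hypothetical, and the sketch assumes that the s new vertices are appended at the end of the vertex sequence, i.e., as the lowest bits of the element number, as in the example above.

```python
def raise_order(src: list, s: int) -> list:
    """Raise a tensor by s orders: dst[idx] = src[idx // 2**s].
    A right shift by s bits is the integer part of idx / 2**s."""
    return [src[idx >> s] for idx in range(len(src) << s)]


a12 = [1, 2, 3, 4]
print(raise_order(a12, 1))
print(raise_order(a12, 2))
```

Each source element is simply repeated 2^s times, which is why the raised tensor needs no arithmetic beyond the index shift.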
As a further example, take the first tensor before raising to be the 10th-order tensor A_{12345678910} = {1, 2, 3, ..., 1024}, with element numbers 0, 1, 2, ..., 1023 (binary 0000000000, 0000000001, ..., 1111111111), raised to A_{1234567891011}, so r = 10, s = 1 and t = 11. Since 2^t is greater than num (1024), 2^t/num = 2 thread blocks are configured, with block_id = 0 and 1, and thread_id = 0, 1, ..., 1023 in each block. The raised tensor has 2^11 = 2048 elements: thread block 0 computes idx = 0, 1, 2, ..., 1023 and thread block 1 computes idx = 1024, 1025, ..., 2047, so the idx of A_{1234567891011} ranges from 0 to 2047.
Calculating idx/2^s = idx/2 gives 0, 0.5, 1, 1.5, ..., 1023, 1023.5 in turn, so that:
dst[0]=src[0]=1,dst[1]=src[0.5]=src[0]=1;
dst[2]=src[1]=2,dst[3]=src[1.5]=src[1]=2;
……
dst[2046]=src[1023]=1024,dst[2047]=src[1023.5]=src[1023]=1024。
This gives A_{1234567891011} = {1, 1, 2, 2, ..., 1024, 1024}.
After the raised tensor is obtained, the video memory occupied by the original first tensor is released.
Continuing with first edge E_{12} and second edge E_{23}: after the first tensor is updated to A_{123}, the second edge E_{23} is deleted and vertex 3, the vertex of the second edge not connected to the first edge, is connected to the first edge, giving the fused intermediate edge E_{123}. The GPU is then called to calculate the elements of the tensor C_{123} of the intermediate edge according to the recorded correspondence between the vertex positions of the first edge and the second edge.
The correspondence can be obtained from the first tensor A_{123} and the second tensor B_{23}. The vertex sequence of A_{123} is "123" and that of B_{23} is "23". Vertex 2 is at position 2 in the vertex sequence of A_{123} and at position 1 in that of B_{23}; vertex 3 is at position 3 in the vertex sequence of A_{123} and at position 2 in that of B_{23}. This positional correspondence is recorded. Specifically, it may be stored in an array, for example an array Maskarray whose elements have the structure: struct { position of the vertex in the vertex sequence of A_{123}, position of the vertex in the vertex sequence of B_{23} }.
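A possible way to build such a correspondence array is sketched below in Python. The function name build_mask_array is hypothetical, positions are 1-based and counted from the left of the vertex sequence as in the text, and a list of tuples stands in for the array of structs.

```python
def build_mask_array(seq_a: list[int], seq_b: list[int]) -> list[tuple[int, int]]:
    """For every vertex of the second tensor, record the pair
    (position in the first tensor's vertex sequence,
     position in the second tensor's vertex sequence)."""
    return [
        (seq_a.index(vertex) + 1, seq_b.index(vertex) + 1)
        for vertex in seq_b
        if vertex in seq_a
    ]


# Vertex sequences "123" for A_{123} and "23" for B_{23}.
print(build_mask_array([1, 2, 3], [2, 3]))
```

For the example, vertex 2 yields the pair (2, 1) and vertex 3 the pair (3, 2), matching the positions stated above.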
Specifically, the tensor elements of the intermediate edge may be calculated as follows:
the GPU first allocates an array, and the Maskarray held by the CPU of the computing node is copied to the GPU and stored in that array;
the number of thread blocks is set according to the order q of the updated first tensor and the number num of threads per thread block in the GPU: 1 thread block if 2^q is less than num, and 2^q/num thread blocks otherwise; the first element number idx of the tensor of the intermediate edge is calculated from the thread block serial number block_id, the number num of threads per thread block and the thread number thread_id, using the same formula as formula (2);
the element of the second tensor of the second edge corresponding to each element of the first tensor is determined according to the correspondence stored in the array; each element of the first tensor is traversed and updated with the product of that element and its corresponding element in the second tensor.
Taking the first tensor A_{123} = {1, 1, 2, 2, 3, 3, 4, 4} and the second tensor B_{23} = {5, 6, 7, 8} as an example, the correspondence shows that vertex 2 is at positions 2 and 1 in the vertex sequences of A_{123} and B_{23} respectively, and vertex 3 is at positions 3 and 2 respectively.
First, the position numbers of the elements of the first tensor are expressed as binary numbers (the values of the vertices "123"): (000)_2, (001)_2, (010)_2, (011)_2, (100)_2, (101)_2, (110)_2, (111)_2. The position numbers of the elements of the second tensor are likewise expressed as binary numbers (the values of the vertices "23"): (00)_2, (01)_2, (10)_2, (11)_2. From the positional correspondence of vertices 2 and 3 in the vertex sequences of A_{123} and B_{23}, the element of B_{23} corresponding to each element of A_{123} can be determined, as shown in Table 1:
Table 1: The element of B_{23} corresponding to each element of A_{123}
[Table 1 appears as image BDA0002057592260000261 in the original publication.]
In the second row of Table 1, the underlining beneath the binary numbers marks the values of the vertices of B_{23} within the position number of each element of A_{123}; it is for clarity only and has no limiting meaning.
Each element in the first tensor is traversed and multiplied by its corresponding element in the second tensor, and the product is then used to update the element.
Taking A_{123} and B_{23} as an example, each element of A_{123} is first multiplied by its corresponding element of B_{23}, giving the products 5, 6, 14, 16, 15, 18, 28, 32 in turn; these products then update A_{123}, giving the tensor C_{123} = {5, 6, 14, 16, 15, 18, 28, 32} of the intermediate edge E_{123}.
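The element-wise update can be illustrated with the following Python sketch. It assumes that the leftmost vertex of a vertex sequence is the highest bit of the element number and that the correspondence is given as (position in first sequence, position in second sequence) pairs, 1-based from the left; the function names corresponding_index and fuse_elements are hypothetical.

```python
def corresponding_index(idx: int, order_a: int, order_b: int,
                        mask_array: list[tuple[int, int]]) -> int:
    """Map an element number of the first tensor to the element number of
    the second tensor that has the same values on the shared vertices."""
    j = 0
    for pos_a, pos_b in mask_array:
        bit = (idx >> (order_a - pos_a)) & 1   # vertex value inside idx
        j |= bit << (order_b - pos_b)          # same value inside j
    return j


def fuse_elements(a: list, b: list, mask_array: list[tuple[int, int]]) -> list:
    """Update every element of the first tensor with its product with the
    corresponding element of the second tensor."""
    order_a = len(a).bit_length() - 1          # len(a) == 2**order_a
    order_b = len(b).bit_length() - 1
    return [
        a[i] * b[corresponding_index(i, order_a, order_b, mask_array)]
        for i in range(len(a))
    ]


a123 = [1, 1, 2, 2, 3, 3, 4, 4]
b23 = [5, 6, 7, 8]
print(fuse_elements(a123, b23, [(2, 1), (3, 2)]))
```

With the example tensors the sketch reproduces the products 5, 6, 14, 16, 15, 18, 28, 32 listed above.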
The order reduction of the tensor of the new edge is performed by the GPU, and the process may be as follows:
first, the GPU sets the number of thread blocks according to the tensor order of the new edge after order reduction;
second, the first element number of the reduced tensor is calculated from the thread block serial number, the number of threads in each thread block and the thread number, and the two second element numbers of the tensor before reduction that correspond to the first element number are calculated; the bits of an element number correspond one-to-one to the vertex positions connected to the current edge, and the value of each bit is the value of the vertex at the corresponding vertex position.
It should be noted that these two steps follow the same calculation principle as the corresponding steps of determined-value order reduction and are not described again here.
Then, the two second element values corresponding one-to-one to the two second element numbers are obtained and summed, and the sum is determined as the first element value corresponding to the first element number.
Compared with determined-value order reduction, the difference is as follows. Determined-value order reduction selects, from the two second element numbers L1 and L2, the one (say L1) whose bit at the position of the given vertex equals the preset determined value, and takes the second element value at L1 as the first element value for the first element number. The order reduction here instead adds the two second element values at L1 and L2 and takes the sum as the first element value for the first element number.
Referring to the implementation and example of determined-value order reduction above, assume that the 5th-order tensor A_{13254} is the tensor of the new edge finally obtained by fusion for vertex 2, with all other conditions unchanged. For the first element number 10 (binary 1010), the two corresponding second element numbers before the reduction are calculated as 18 (10010) and 22 (10110). Because vertex 2, at the 3rd order, has an undetermined value (it may be 0 or 1), both element numbers 18 and 22 correspond to the first element number 10. The element values p and p' at element numbers 18 and 22 are obtained and added, and the sum p + p' is the first element value for the first element number 10. Proceeding in the same way for the other element numbers finally gives A_{1354}, the reduced tensor of A_{13254}.
As another example, assume the vertex is vertex 2 and that fusing all its connected edges gives the new edge E_{1234} with tensor A_{1234} = {5, 5, 6, 6, 14, 14, 16, 16, 15, 15, 18, 18, 28, 28, 32, 32}. The vertex sequence of the new edge tensor is "1234"; removing vertex 2 gives the new vertex sequence "134". The correspondence between the values of the new vertex sequence and the elements of A_{1234} is shown in Table 2:
Table 2: Values of the new vertex sequence "134" and the corresponding elements of A_{1234}
[Table 2 appears as images BDA0002057592260000271 and BDA0002057592260000281 in the original publication.]
Thus, after the order of the target edge E_{1234} is reduced with respect to vertex 2, vertex 2 is deleted, giving the reduced new edge E_{134} with tensor A_{134} = {19, 19, 22, 22, 43, 43, 50, 50}.
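The summing form of order reduction can be checked with a short Python sketch. The function name sum_out_vertex is hypothetical; the order of the removed vertex is counted from the lowest bit starting at 1, so vertex 2 in the sequence "1234" sits at the 3rd order.

```python
def sum_out_vertex(tensor: list, order_pos: int) -> list:
    """Order reduction of a new edge: for every post-reduction element
    number, add the two pre-reduction elements that differ only in the
    bit of the removed vertex."""
    low_mask = (1 << (order_pos - 1)) - 1
    reduced = []
    for idx in range(len(tensor) // 2):
        low = idx & low_mask
        high = idx & ~low_mask
        with_zero = (high << 1) | low                   # removed vertex = 0
        with_one = with_zero | (1 << (order_pos - 1))   # removed vertex = 1
        reduced.append(tensor[with_zero] + tensor[with_one])
    return reduced


a1234 = [5, 5, 6, 6, 14, 14, 16, 16, 15, 15, 18, 18, 28, 28, 32, 32]
print(sum_out_vertex(a1234, 3))
```

Applied to A_{1234}, the sketch gives the same eight values as A_{134} above.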
S204: returning the sub-amplitude to the master node of the distributed cluster, so that the master node reduces the sub-amplitudes to obtain the amplitude of the quantum state as the target single amplitude.
Reduction here means data reduction: reducing the data volume as far as possible while preserving the original character of the data. The master node reduces the sub-amplitudes calculated by the computing nodes, i.e., sums all the sub-amplitudes, to obtain the target single amplitude of the measured quantum state.
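The two levels of summation (paths within a node, then nodes at the master) might look as follows in Python. The function names and the numeric values are purely illustrative assumptions; in a real cluster the second sum would be carried out by a collective reduce operation rather than a local loop.

```python
def node_sub_amplitude(first_sub_amplitudes: list[complex]) -> complex:
    """Sub-amplitude of one computing node: the sum of the first
    sub-amplitudes of all its paths (sub-undirected graphs)."""
    return sum(first_sub_amplitudes, 0j)


def target_single_amplitude(per_node: list[list[complex]]) -> complex:
    """Master-node reduction: the sum of the sub-amplitudes returned by
    all computing nodes."""
    return sum((node_sub_amplitude(paths) for paths in per_node), 0j)


# Hypothetical first sub-amplitudes for 2 nodes with 2 paths each.
per_node = [[0.5 + 0.25j, 0.25 - 0.5j], [-0.25 + 0j, 0.125 + 0.125j]]
print(target_single_amplitude(per_node))
```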
Existing CPU microarchitectures are designed for efficient instruction execution and are strong at logic processing, whereas a GPU has a very large number of threads (hundreds of thousands) and is dedicated to large-scale concurrent computation; its numerical computation efficiency is usually about 5 to 10 times that of a CPU. Accordingly, in the embodiment of the present invention the main computation tasks, including determined-value order reduction, order raising, and the calculation of the tensor elements of intermediate edges during fusion, are all assigned to the GPU attached to the CPU of each computing node, while the CPU mainly performs logic processing. The two cooperate, keeping the efficiency of single-amplitude quantum computation simulation at a high level.
Thus the method calculates only one target single amplitude of the qubits involved at a time. Specifically, the target quantum program is mapped onto an undirected graph, the undirected graph is split across multiple computing nodes using the path integral method, and each computing node computes its sub-undirected graphs in cooperation with its attached GPU. Most of the calculation consists of simple operations on tensor elements. Compared with the unitary-matrix-based full-amplitude simulation of the prior art, the memory requirement is greatly reduced and the amount of computation does not grow exponentially with the number of qubits, so quantum computation simulation involving 50 or more qubits can be realized; and because the GPU is well suited to massively parallel computation, the overall simulation efficiency is higher. At present, the technical solution provided by the embodiments of the present invention can realize quantum computation simulation involving up to 196 qubits.
In addition, in practice sometimes only one or a few of the full set of amplitudes of the qubits are needed. Using the prior-art full-amplitude approach, which simulates all amplitudes at once, would then waste memory, time and other resources. With the method provided by the embodiment of the invention, one or more targeted simulations can be run to obtain just the single amplitudes required, greatly saving resources and time.
The construction, features and functions of the present invention are described in detail in the embodiments illustrated in the drawings, which are only preferred embodiments of the present invention, but the present invention is not limited by the drawings, and all equivalent embodiments modified or changed according to the idea of the present invention should fall within the protection scope of the present invention without departing from the spirit of the present invention covered by the description and the drawings.

Claims (10)

1. A single amplitude quantum computational simulation method, the method comprising:
obtaining a target quantum program for each computing node of the distributed cluster;
constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the operation of the quantum logic gate, and one edge of the undirected graph corresponds to a tensor;
obtaining a quantum state corresponding to a target single amplitude to be measured, and calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and matched with the GPU corresponding to the calculation node; wherein the sub-amplitude is an amplitude corresponding to the undirected graph;
and returning the sub-amplitudes to the main node of the distributed cluster so that the main node reduces each sub-amplitude to obtain the amplitude of the quantum state as a target single amplitude.
2. The single-amplitude quantum computation simulation method of claim 1, characterized in that: the constructing of the undirected graph corresponding to the target quantum program comprises:
analyzing the target quantum program to obtain a linked list for recording quantum program information;
traversing the linked list, and creating an edge with a tensor order of 1 when the type of the quantum logic gate in the linked list is a first single quantum gate; wherein the edge is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second single quantum gate, creating an edge with the tensor order of 2 and a vertex connected with the edge; the edge is connected with the last vertex of the corresponding vertex chain of the quantum bit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is a non-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, an edge with the tensor order of 2 is created; the edge is connected with the last vertex in the vertex chain respectively corresponding to the two qubits operated by the first double-quantum gate, and the unitary matrix of the first double-quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double quantum gate, an edge with the tensor order of 4 and two vertexes connected with the edge are created; the edge is connected with the last vertex in the vertex chain respectively corresponding to the two qubits operated by the second double-quantum gate, and the unitary matrix of the second double-quantum gate is a non-diagonal matrix;
and obtaining an undirected graph corresponding to the target quantum program.
3. The single-amplitude quantum computation simulation method of claim 2, characterized in that: the calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and matched with the GPU corresponding to the calculating node comprises the following steps:
calling a GPU corresponding to the computing node, and respectively determining the tensors of edges connected with specific vertexes of the undirected graph to reduce the order; wherein the specific vertex is the first and last vertex of the vertex chain corresponding to each qubit;
deleting the particular vertex;
receiving a value of a target vertex allocated by the master node, splitting a current undirected graph based on the value of the target vertex, and calling the GPU to respectively determine value reduction of tensors of connecting edges of the target vertex aiming at each sub-undirected graph obtained by splitting;
aiming at each vertex in the sub-undirected graph, combining the GPU to fuse all connecting edges of the vertex into a new edge, reducing the tensor of the new edge, and deleting the vertex;
taking product of tensor values of all the reduced new edges to obtain a first sub-amplitude of the quantum state corresponding to the sub-undirected graph;
and summing the first sub-amplitudes of all the sub-undirected graphs in the quantum state to obtain the sub-amplitude of the quantum state.
4. The single-amplitude quantum computation simulation method of claim 3, wherein: the determining the values of the tensors of the edges connected to the specific vertexes of the undirected graph and reducing the orders respectively comprises:
the GPU corresponding to the computing node sets the number of thread blocks according to the reduced tensor order and the number of threads in each thread block in the GPU aiming at the edge connected with each specific vertex;
calculating a first element number of the tensor after the reduction according to the thread block serial number, the number of threads in each thread block and the thread number, and calculating two second element numbers of the tensor before the reduction corresponding to the first element number; wherein the bits of an element number correspond one-to-one to the vertex positions connected to the current edge, and the value of each bit of the element number is the value of the vertex at the corresponding vertex position;
determining a second element number with a preset determination value on the number position corresponding to the specific vertex position from the two second element numbers;
and acquiring a second element value corresponding to the determined second element number, and determining the second element value as a first element value corresponding to the first element number.
5. The single-amplitude quantum computation simulation method of claim 4, wherein: the receiving of the value of the target vertex allocated by the master node, splitting the current undirected graph based on the value of the target vertex, and calling the GPU to determine value reduction of the tensors of the connecting edges of the target vertex for each sub-undirected graph obtained by the splitting, includes:
receiving one or more values of the target vertex equally divided by the master node; wherein the target vertex is the first m vertices with the largest number of connected edges in the current undirected graph, the m vertices have 2^m possible values, the number of computing nodes is 2^n, n is a positive integer, and n is greater than 0 and less than or equal to m;
splitting an undirected graph of the computing node into one or more sub-undirected graphs aiming at each evenly-divided vertex value;
and traversing the edges connected with the target vertex aiming at each sub-undirected graph, and calling the GPU to respectively determine the tensors of the edges connected with the target vertex and reduce the order.
6. The single-amplitude quantum computation simulation method of claim 3, wherein: the matching with the GPU to fuse all the connecting edges of the vertex into a new edge includes:
determining a first edge and a second edge to be fused aiming at all connecting edges of the vertex;
calling the GPU to perform upscaling on the first tensor of the first edge according to the vertex which is not connected with the first edge in the second edge, and updating the first tensor by the upscaled tensor;
deleting the second edge, and connecting the vertex of the second edge, which is not connected with the first edge, to the first edge to obtain a fused middle edge;
calling the GPU to calculate tensor elements of the middle edge according to the recorded corresponding relation between the vertex numbers of the first edge and the second edge;
and returning to the step of determining the first edge and the second edge to be fused until the tensor element obtained by calculation is the tensor element of the last edge, and determining the last edge as a new edge to be fused.
7. The single-amplitude quantum computation simulation method of claim 6, wherein: the upscaling of the first tensor of the first edge comprises:
the GPU calculates the tensor order after the order is increased according to the order of the first tensor and the increased order;
setting the number of thread blocks according to the tensor order after the upgrade and the number of threads in each thread block in the GPU;
calculating the first element number of the tensor after the upgrade according to the thread block serial number, the number of threads in each thread block and the thread number;
the element of the tensor after the ascending order is calculated according to the first element number, the ascending order and the element of the first tensor.
8. The single-amplitude quantum computation simulation method of claim 6, wherein: the computing tensor elements for the intermediate edges includes:
the GPU sets the number of thread blocks according to the updated order of the first tensor and the number of threads in each thread block in the GPU;
calculating the first element number of the tensor of the middle edge according to the thread block serial number, the number of threads in each thread block and the thread number;
determining a corresponding element of each element in the first tensor in a second tensor of the second edge according to the corresponding relation;
traversing each element in the first tensor to update the element by its product with its corresponding element in the second tensor.
9. The single-amplitude quantum computation simulation method of claim 6, wherein: the reducing the tensor of the new edge includes:
the GPU sets the number of thread blocks according to the tensor order of the new edge after order reduction;
calculating a first element number of the tensor after the reduction according to the thread block serial number, the number of threads in each thread block and the thread number, and calculating two second element numbers of the tensor before the reduction corresponding to the first element number; wherein the bits of an element number correspond one-to-one to the vertex positions connected to the current edge, and the value of each bit of the element number is the value of the vertex at the corresponding vertex position;
and acquiring two second element values corresponding to the two second element numbers one to one, summing the two second element values, and determining the sum as a first element value corresponding to the first element number.
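The order reduction in this claim amounts to summing over the two values of one vertex. The Python sketch below emulates it on the CPU under stated assumptions: `expand_index`, `reduce_order` and `pos` are hypothetical names, and `pos` is taken to be the bit position, in the element number before reduction, of the vertex being summed out. The two second element numbers are then obtained by inserting a 0 bit and a 1 bit at that position.

```python
def expand_index(idx, pos, bit):
    """Insert `bit` at bit position `pos` of `idx`, shifting higher bits up."""
    low = idx & ((1 << pos) - 1)
    high = idx >> pos
    return (high << (pos + 1)) | (bit << pos) | low


def reduce_order(before, order, pos, num=4):
    """Sum out the vertex stored at bit `pos` of a tensor of the given order."""
    assert len(before) == 1 << order
    assert 0 <= pos < order
    total = 1 << (order - 1)               # element count after order reduction
    blocks = (total + num - 1) // num      # thread blocks needed to cover all elements
    reduced = [0] * total
    for block_id in range(blocks):         # stands in for the GPU block index
        for thread_id in range(num):       # stands in for the thread index in a block
            idx = block_id * num + thread_id   # first element number
            if idx < total:
                i0 = expand_index(idx, pos, 0)   # second element number, vertex = 0
                i1 = expand_index(idx, pos, 1)   # second element number, vertex = 1
                reduced[idx] = before[i0] + before[i1]
    return reduced


summed_high = reduce_order([10, 20, 100, 200], order=2, pos=1)
summed_low = reduce_order([10, 20, 100, 200], order=2, pos=0)
```

Summing out the high bit pairs elements 0 with 2 and 1 with 3; summing out the low bit pairs elements 0 with 1 and 2 with 3.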
10. The single-amplitude quantum computation simulation method of any one of claims 4 to 9, wherein: the calculation formula of the first element number is as follows:
Idx = block_id * num + thread_id
where Idx is the first element number, block_id is the thread block serial number, num is the number of threads in each thread block, and thread_id is the thread number.
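A short Python sketch can make the index formula concrete. It only emulates the arithmetic on the CPU; the helper names `element_number` and `covered_numbers` are hypothetical, and rounding the block count up so that every element is covered is an assumption, since the claims only say that the block count is set from the tensor order and the threads per block.

```python
def element_number(block_id, num, thread_id):
    """First element number handled by one thread: Idx = block_id * num + thread_id."""
    return block_id * num + thread_id


def covered_numbers(order, num):
    """List the element numbers visited for a tensor of the given order.

    A tensor of order n over binary vertices has 2**n elements; the block
    count is rounded up so the last block may be only partly used.
    """
    total = 1 << order
    blocks = (total + num - 1) // num
    numbers = []
    for block_id in range(blocks):
        for thread_id in range(num):
            idx = element_number(block_id, num, thread_id)
            if idx < total:                # skip threads past the last element
                numbers.append(idx)
    return numbers


example = element_number(block_id=2, num=256, thread_id=5)
```

Under these assumptions each element number from 0 to 2**order - 1 is produced exactly once, which is what lets each thread own a single tensor element.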
CN201910394102.9A 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method Active CN111931939B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910394102.9A CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910394102.9A CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Publications (2)

Publication Number Publication Date
CN111931939A (en) 2020-11-13
CN111931939B (en) 2024-02-09

Family

ID=73282551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910394102.9A Active CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Country Status (1)

Country Link
CN (1) CN111931939B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114692880A (en) * 2020-12-31 2022-07-01 Origin Quantum Computing Company, Limited, Hefei Simulation method and device for quantum state amplitude in quantum line
WO2022143224A1 (en) * 2020-12-29 2022-07-07 Origin Quantum Computing Company, Limited, Hefei Amplitude estimation method and device for quantum circuit, storage medium, and electronic device
CN114764549A (en) * 2020-12-31 2022-07-19 Origin Quantum Computing Company, Limited, Hefei Quantum line simulation calculation method and device based on matrix product state

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120071333A1 (en) * 2010-07-26 2012-03-22 Tampere University Of Technology Uses of systems with degrees of freedom poised between fully quantum and fully classical states
US20120327287A1 (en) * 2007-12-06 2012-12-27 U.S. Government As Represented By The Secretary Of The Army Method and system for producing image frames using quantum properties
CN103038956A (en) * 2010-03-19 2013-04-10 The Governing Council of the University of Toronto Amplitude and phase modulation of a laser by modulation of an output coupler
CN105960651A (en) * 2013-12-05 2016-09-21 Microsoft Technology Licensing, LLC A method and system for computing distance measures on a quantum computer
CN108833353A (en) * 2018-05-18 2018-11-16 Central South University Quantum Byzantine agreement method based on three-party participation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120327287A1 (en) * 2007-12-06 2012-12-27 U.S. Government As Represented By The Secretary Of The Army Method and system for producing image frames using quantum properties
CN103038956A (en) * 2010-03-19 2013-04-10 The Governing Council of the University of Toronto Amplitude and phase modulation of a laser by modulation of an output coupler
US20120071333A1 (en) * 2010-07-26 2012-03-22 Tampere University Of Technology Uses of systems with degrees of freedom poised between fully quantum and fully classical states
CN105960651A (en) * 2013-12-05 2016-09-21 Microsoft Technology Licensing, LLC A method and system for computing distance measures on a quantum computer
CN108833353A (en) * 2018-05-18 2018-11-16 Central South University Quantum Byzantine agreement method based on three-party participation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Jianxin Chen et al.: "Classical Simulation of Intermediate-Size Quantum Circuits", arXiv:1805.01450v2, pages 1-12 *
Wei Yongxin et al.: "Global Phase in Single-Qubit Quantum Computation", Journal of Fujian Normal University (Natural Science Edition), vol. 29, no. 3, pages 42-46 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022143224A1 (en) * 2020-12-29 2022-07-07 Origin Quantum Computing Company, Limited, Hefei Amplitude estimation method and device for quantum circuit, storage medium, and electronic device
CN114757357A (en) * 2020-12-29 2022-07-15 Origin Quantum Computing Company, Limited, Hefei Amplitude estimation method and device for quantum line, storage medium and electronic device
CN114757357B (en) * 2020-12-29 2023-06-02 Origin Quantum Computing Company, Limited, Hefei Quantum circuit amplitude estimation method and device, storage medium and electronic device
US11900220B2 (en) 2020-12-29 2024-02-13 Origin Quantum Computing Technology (Hefei) Co., Ltd Method and apparatus for amplitude estimation of quantum circuit, storage medium, and electronic apparatus
CN114692880A (en) * 2020-12-31 2022-07-01 Origin Quantum Computing Company, Limited, Hefei Simulation method and device for quantum state amplitude in quantum line
CN114764549A (en) * 2020-12-31 2022-07-19 Origin Quantum Computing Company, Limited, Hefei Quantum line simulation calculation method and device based on matrix product state
CN114764549B (en) * 2020-12-31 2023-04-25 Origin Quantum Computing Company, Limited, Hefei Quantum circuit simulation calculation method and device based on matrix product state
CN114692880B (en) * 2020-12-31 2023-09-05 Origin Quantum Computing Technology (Hefei) Co., Ltd. Quantum state amplitude simulation method and device in quantum circuit

Also Published As

Publication number Publication date
CN111931939B (en) 2024-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 230008 6th floor, building E2, phase II, venture industrial park, high tech Zone, Hefei City, Anhui Province

Applicant after: Benyuan Quantum Computing Technology (Hefei) Co.,Ltd.

Address before: 230008 6th floor, building E2, phase II, venture industrial park, high tech Zone, Hefei City, Anhui Province

Applicant before: ORIGIN QUANTUM COMPUTING COMPANY, LIMITED, HEFEI

GR01 Patent grant