WO2024109128A1 - Neural network-based quantum error correction decoding method, apparatus, device and chip

Neural network-based quantum error correction decoding method, apparatus, device and chip

Info

Publication number: WO2024109128A1
Authority: WO (WIPO PCT)
Prior art keywords: decoding, feature, error, information, neural network
Application number: PCT/CN2023/108856
Other languages: English (en), French (fr)
Inventor: 郑一聪
Original Assignee: 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2024109128A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 10/00: Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • G06N 10/70: Quantum error correction, detection or prevention, e.g. surface codes or magic state distillation
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods

Definitions

  • the embodiments of the present application relate to the fields of artificial intelligence and quantum technology, and in particular to a quantum error correction decoding method, device, equipment and chip based on a neural network.
  • the corresponding error symptom information is obtained by measuring the symptoms of the quantum circuit, and then the error symptom information is decoded to determine the quantum bits in the quantum circuit where errors occur and the corresponding error types.
  • some schemes for decoding error symptom information are provided, such as a decoding scheme based on MWPM (Minimum Weight Perfect Matching), a decoding scheme based on the RG (Renormalization Group) algorithm, a decoding scheme based on CA (Cellular Automaton), a decoding scheme based on neural networks, and so on.
  • the decoding scheme based on neural networks still has some shortcomings in decoding capabilities.
  • the embodiment of the present application provides a quantum error correction decoding method, device, equipment and chip based on neural network.
  • the technical solution is as follows:
  • a neural network-based quantum error correction decoding method which is executed by a control device, and includes: obtaining error symptom information obtained by symptom measurement of a quantum circuit; extracting feature information from the error symptom information by a neural network decoder; decoding the feature information by the neural network decoder to obtain a decoding result; and determining error result information of the quantum circuit based on the decoding result.
  • a quantum error correction decoding device based on a neural network comprising:
  • a symptom acquisition module used to obtain error symptom information obtained by performing symptom measurement on the quantum circuit
  • a feature extraction module used for extracting feature information from the error symptom information through a neural network decoder
  • a feature decoding module used to decode the feature information through the neural network decoder to obtain a decoding result
  • a result determination module used to determine error result information of the quantum circuit according to the decoding result.
  • a computer device including a processor and a memory, wherein a computer program is stored in the memory, and the computer program is loaded and executed by the processor to implement the above method.
  • a computer-readable storage medium in which a computer program is stored.
  • the computer program is loaded and executed by a processor to implement the above method.
  • a computer program product includes a computer program.
  • the computer program is loaded and executed by a processor to implement the above method.
  • a chip is provided, wherein the chip is deployed with a neural network decoder, and the neural network decoder is used to implement the above method.
  • the embodiment of the present application provides an error correction decoding solution based on a multi-task learning neural network model.
  • the neural network decoder extracts corresponding feature information from the input error symptom information, then decodes the feature information and outputs decoding results describing the local information distribution of the noise decomposition; the error result information is then determined according to the decoding results.
  • the solution of the present application requires only a single neural network decoder to accurately determine the error result information, thereby fully improving decoding performance and shortening decoding time without increasing algorithm complexity, while maintaining scalability; it is also easier to deploy in hardware in engineering.
  • FIG1 is a schematic diagram of a rotated surface code according to an embodiment of the present application.
  • FIG2 is a schematic diagram showing a surface code error occurrence according to an embodiment of the present application.
  • FIG3 is a schematic diagram showing a comparison of the decoding performance and decoding time of several decoding schemes according to an embodiment of the present application.
  • FIG4 is a schematic diagram of an application scenario of a solution provided by an embodiment of the present application.
  • FIG5 is a schematic diagram of an error correction decoding process involved in the application scenario of the solution shown in FIG4 ;
  • FIG6 is a schematic diagram of the simple representations corresponding to symptom points in a rotated surface code provided by an embodiment of the present application.
  • FIG7 is a schematic diagram of a symptom measurement circuit provided by an embodiment of the present application.
  • FIG8 is a schematic diagram of a three-dimensional symptom distribution provided by an embodiment of the present application.
  • FIG9 is an architecture diagram of decoding using multiple neural network models provided by one embodiment of the present application.
  • FIG10 is a flow chart of a quantum error correction decoding method based on a neural network provided by an embodiment of the present application.
  • FIG11 is an architecture diagram of a neural network decoder based on multi-task learning provided by one embodiment of the present application.
  • FIG12 is a schematic diagram of a decoding process of a first type of decoder provided in one embodiment of the present application.
  • FIG13 is a schematic diagram of a decoding process of a second type of decoder provided in an embodiment of the present application.
  • FIG14 is a schematic diagram of a decoding process of a second type of decoder provided in another embodiment of the present application.
  • FIG15 is a schematic diagram of a feature extraction subnetwork for performing local feature extraction mapping according to an embodiment of the present application.
  • FIG16 is a schematic diagram of the correlation noise generated by the symptom measurement circuit provided by one embodiment of the present application.
  • FIG17 is a schematic diagram of physical quantum bit division provided by one embodiment of the present application.
  • FIG18 is a schematic diagram of physical quantum bit division provided by another embodiment of the present application.
  • FIG19 is a schematic diagram of a multi-core architecture provided by an embodiment of the present application.
  • FIG20 is a schematic diagram of experimental result data corresponding to the first error decomposition method provided by an embodiment of the present application.
  • FIG21 is a schematic diagram of chip deployment provided by an embodiment of the present application.
  • FIG22 is a schematic diagram of experimental result data corresponding to the first error decomposition method provided by another embodiment of the present application.
  • FIG23 is a schematic diagram of experimental result data corresponding to a second error decomposition method provided by an embodiment of the present application.
  • FIG24 is a block diagram of a quantum error correction decoding device based on a neural network provided by one embodiment of the present application.
  • FIG25 is a schematic diagram of the structure of a computer device provided in one embodiment of the present application.
  • Quantum Computation A method of using the superposition and entanglement properties of quantum states to quickly complete specific computing tasks.
  • Quantum Error Correction A method of encoding a quantum state by mapping it to a subspace in the Hilbert space of a multi-body quantum system. Quantum noise will migrate the encoded quantum state to other subspaces. By continuously observing the space where the quantum state is located (symptom extraction), it is possible to evaluate and correct the quantum noise without interfering with the encoded quantum state, thereby protecting the encoded quantum state from interference from quantum noise.
  • a [[n,k,d]] quantum error correction code encodes k logical quantum bits in n physical quantum bits and can correct an arbitrary error occurring on any ⌊(d-1)/2⌋ physical quantum bits.
  • Data quantum state The quantum state of the data quantum bit used to store quantum information during quantum computing.
  • Stabilizer generator also called parity check operator.
  • the occurrence of quantum noise (errors) changes the eigenvalues of certain stabilizer generators, and quantum error correction can be performed based on this information.
  • a stabilizer group is the Abelian group generated by the stabilizer generators. If there are k independent stabilizer generators, the stabilizer group contains 2^k elements.
  • Error syndrome When there is no error, the eigenvalues of the stabilizer generators are 0; when quantum noise occurs, the eigenvalues of those stabilizer generators (parity check operators) of the error correction code that anticommute with the error become 1.
  • the bit string composed of these 0 and 1 syndrome bits is called error syndrome.
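To illustrate how an error syndrome arises as a bit string, the following sketch applies a toy binary parity-check matrix to an error vector. The matrix H and the four-bit layout are illustrative assumptions for this editorial example, not the actual checks of the codes discussed in this disclosure.

```python
import numpy as np

# Toy parity checks: each row of H marks which data bits one
# stabilizer generator (parity check operator) acts on.
H = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
error = np.array([0, 1, 0, 0])   # a single bit-flip on data bit 1
syndrome = (H @ error) % 2       # eigenvalue bits flip where checks anticommute
print(syndrome)                  # [1 1 0] -> the error syndrome bit string
```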
  • Topological quantum error correction code A special category of quantum error correction codes.
  • the quantum bits of this type of error correction code are distributed on a lattice array of two or more dimensions.
  • the lattice forms a discrete structure of a high-dimensional manifold.
  • the stabilizer generators of the error correction code are defined on finitely many geometrically neighboring quantum bits, so they are geometrically local and physically easy to measure.
  • the quantum bits supporting the logical operators of this type of error correction code constitute a topologically non-trivial geometric object on the manifold of the lattice array.
  • Surface code is a type of topological quantum error correction code defined on a two-dimensional manifold. Its stabilizer generators are usually supported on 4 qubits (on 2 qubits at the boundary), and the logical operators are non-trivial strip-shaped chains crossing the array.
  • the specific two-dimensional structure of the surface code (7×7, including 49 data qubits and 48 auxiliary qubits, 97 physical qubits in total, which can correct arbitrary errors occurring on any two qubits) is shown in Figure 1: the black dots 11 represent the data qubits used for quantum computing, and the crosses 12 represent the auxiliary qubits.
  • the auxiliary qubits are initially prepared in the |0⟩ state (for measuring Z-type generators) or the |+⟩ state (for measuring X-type generators).
  • the diagonally filled and white-filled squares represent two different types of stabilizer generators, used to detect Z errors and X errors, respectively.
  • Surface code scale L: one quarter of the perimeter of the surface code array.
  • X and Z errors Pauli X and Pauli Z errors occurring randomly on the quantum state of a physical quantum bit. According to quantum error correction theory, if the error correction code can correct both X and Z errors, then any error occurring on a single quantum bit can be corrected.
  • Fault-tolerant Quantum Error Correction All operations in real quantum computing, including quantum gates and quantum measurements, are noisy. That is to say, even the circuits used for quantum error correction contain noise. Fault-tolerant quantum error correction means that by cleverly designing the error correction circuit, the error correction circuit with noise can be used for error correction, and the purpose of correcting errors and preventing errors from spreading over time can still be achieved.
  • Fault Tolerant Quantum Computation In the process of quantum computing, any physical operation is noisy, including the quantum error correction circuit itself and the quantum bit measurement. If it is assumed that classical operations (such as command input, error correction code decoding, etc.) are noise-free, then fault-tolerant quantum computing is a technical solution that ensures that errors can be effectively controlled and corrected in the process of quantum computing using noisy quantum bits by reasonably designing quantum error correction schemes and performing specific quantum gate operations on the encoded logical quantum states.
  • Physical quantum bit A quantum bit implemented using real physical devices.
  • Logical qubit A mathematical degree of freedom in the Hilbert subspace defined by an error correction code. Its quantum state is usually described as a multi-body entangled state, which is generally a two-dimensional subspace of the Hilbert space of multiple physical qubits. Fault-tolerant quantum computing needs to run on logical qubits protected by error correction codes.
  • Quantum gate/circuit Quantum gate/circuit that acts on physical quantum bits.
  • Threshold theorem For quantum computing schemes that meet the requirements of fault-tolerant quantum computing, when the error rate of all operations is lower than a certain threshold, the accuracy of the calculation can be arbitrarily approached to 1 by using better error correction codes, more quantum bits, and more quantum operations. At the same time, these additional resource overheads are negligible compared to the exponential acceleration of quantum computing.
  • Neural network An artificial neural network is an adaptive nonlinear dynamic system composed of a large number of simple basic elements, neurons, which are interconnected. The structure and function of each neuron are relatively simple, but the system behavior generated by a large number of neuron combinations is very complex and can express any function in principle.
  • Convolutional neural network is a type of feed-forward neural network with deep structure and convolution operation.
  • Convolutional layer is the core building block of a convolutional neural network: a discrete two-dimensional or three-dimensional filter (also called a convolution kernel, which is a two-dimensional or three-dimensional matrix) convolved with a two-dimensional or three-dimensional data lattice.
  • Leaky Rectified Linear Unit Layer (LeakyReLU) is an activation function based on ReLU that, instead of outputting zero for negative inputs, applies a small nonzero slope to them.
  • BP: Back Propagation
  • FPGA: Field Programmable Gate Array
  • ASIC: Application Specific Integrated Circuit
  • CPLD: Complex Programmable Logic Device
  • Single Flux Quantum (SFQ) Circuit Also called an RSFQ (Rapid Single Flux Quantum) circuit, it is a circuit composed of Josephson junctions (JJ) that represents "1" and "0" by the presence or absence of a magnetic flux quantum. "X" is used to represent a Josephson junction in circuit diagrams.
  • a Josephson junction consists of upper and lower superconducting layers separated by a very thin insulating layer. It can be used for digital logic computation.
  • Multi-task learning In this application, it is defined as using the same neural network model to complete multiple classification tasks at the same time. Multiple neural networks are combined into a large neural network, and part of the neural network is shared as much as possible, so as to achieve the purpose of reducing the overall computational complexity and space complexity.
  • Canonical Pauli Operator For a particular stabilizer code, an equivalence class of Pauli operators can be represented by a representative Pauli operator, called the canonical Pauli operator of that class.
  • Adam (Adaptive moment estimation) An algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
  • the Adam algorithm is easy to implement and has high computational efficiency and low memory requirements.
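As a hedged illustration of multi-task learning as defined above, trained with the Adam optimizer, the following PyTorch sketch shares one trunk among several classification heads on toy data. The layer sizes, the 7x7 input, and the three tasks are assumptions made for this example only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy multi-task model: one shared trunk, three classification heads.
trunk = nn.Sequential(nn.Flatten(), nn.Linear(49, 64), nn.LeakyReLU())
heads = nn.ModuleList(nn.Linear(64, 4) for _ in range(3))
opt = torch.optim.Adam(
    list(trunk.parameters()) + list(heads.parameters()), lr=1e-3
)  # Adam: adaptive moment estimation

x = torch.randint(0, 2, (8, 7, 7)).float()        # toy 0/1 symptom batch
ys = [torch.randint(0, 4, (8,)) for _ in heads]   # one label set per task

feat = trunk(x)                                   # shared features
# One cross-entropy loss per head, summed, so the shared trunk
# is trained by all tasks simultaneously.
loss = sum(F.cross_entropy(h(feat), y) for h, y in zip(heads, ys))
opt.zero_grad()
loss.backward()                                   # BP (back propagation)
opt.step()
```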
  • the solution provided in the embodiments of the present application relates to the application of artificial intelligence machine learning technology in the field of quantum technology, and specifically to the application of machine learning technology in the decoding algorithm of quantum error correction code, which is specifically illustrated by the following embodiments.
  • error correction code when an error occurs, the error symptoms (error syndrome) can be obtained through parity checking; then, based on these symptoms, a specific decoding algorithm for the error correction code is needed to determine the location and type of the error (whether it is an X error, a Z error, or both, that is, a Y error).
  • error and error symptoms have specific spatial locations: when an error causes symptoms, the eigenvalue of the auxiliary quantum bit at the corresponding position is 1 (which can be regarded as a point particle appearing at that position), and when there is no error, the eigenvalue of the auxiliary quantum bit at the corresponding position is 0.
  • decoding can be reduced to the following problem: given a spatial digital array (2D or 3D, with a value of 0 or 1), according to a specific error occurrence model (error model) - the probability distribution of errors occurring on quantum bits - infer which quantum bits are most likely to have errors, and the specific error types, and perform error correction based on this inference result.
  • Figure 2 shows a schematic diagram of the occurrence of surface code errors.
  • the data qubits are on the edges of the two-dimensional array, and the auxiliary qubits that measure the error symptoms are on the nodes of the two-dimensional array (the symptoms are assumed to be measured perfectly).
  • the black edge 21 in Figure 2 represents the error chain formed by the qubits where the error occurs, and the circle part 22 filled with slashes represents the point where the symptom value caused by the error is 1. As long as the chain error can be determined by the point-like symptoms, the decoding can be completed.
  • the decoding algorithm of the error correction code (also called a decoder) infers from the error symptoms the corresponding error result information, such as the location and type of the error.
  • the decoding capability of a decoder can be measured by the following key indicators: decoding algorithm complexity, decoding time, decoding performance, suitability for real-time error correction, and engineering implementation difficulty.
  • Decoding algorithm complexity refers to the total basic calculation steps required to operate the decoding algorithm, corresponding to the computational complexity. The higher the complexity, the greater the amount of calculation required.
  • Decoding time The time here is an abstract concept, which is different from the actual decoding time, but has a strong correlation. It refers to the algorithm depth after the decoding algorithm is fully parallelized. This depth determines the lower limit of the actual running time of the decoding algorithm, that is, the algorithm running time required after maximum parallelization.
  • Decoding performance After decoding and error correction according to a specific noise model, it is measured by the error rate that occurs on the logical qubit. For the same physical qubit error rate, the lower the logical error rate, the better the decoding performance.
  • Suitable for real-time error correction The lifetime of quantum bits is short (for example, superconducting quantum bits live about 150 microseconds under a good process). After multiple rounds of symptom measurement, real-time decoding and error correction must be performed based on these symptoms; during decoding the system sits idle and errors accumulate over time. Theoretically, the entire error correction process must take less than 1/1000 to 1/100 of the lifetime of the superconducting quantum bit; in other words, the hard budget for the entire error correction time is about 150ns-1500ns, otherwise the error rate may exceed the error correction capability of the surface code. CPUs (Central Processing Units) and GPUs (Graphics Processing Units), however, suffer from memory read/write time uncertainty, cache hit uncertainty, and branch jumps.
  • the computing microarchitecture of the CPU/GPU is not optimized for the decoding algorithm, so the required performance targets cannot be met; generality comes at the cost of speed.
  • This application considers porting the decoding algorithm to specific computing devices such as FPGA or ASIC. Such devices are more suitable for running simple steps in parallel (such as vector inner products, matrix multiplications, etc.), and are not suitable for running complex instructions with conditional judgments and jumps.
  • Difficulty of engineering implementation refers to whether it is easy to deploy the decoder in hardware in engineering.
  • for some algorithms, the theoretical time complexity of real-time decoding is low, but in practice the control is complicated or the actual amount of computation is still large, requiring multiple computing devices to cooperate in parallel.
  • the delay caused by the communication between chips may even be greater than the delay of the calculation itself, which is unacceptable in actual real-time decoding. Therefore, an algorithm that is easy to implement in engineering needs to really reduce the amount of calculation to reduce the number of computing devices used and the communication between computing devices (chips).
  • the on-chip cache that can be reserved is limited, so not too much data can be preloaded onto the chip.
  • this requires that the number of network parameters be small and grow slowly with the error correction code scale, so that the parameters can be pre-placed in the chip's on-chip memory for fast reading.
  • Some of the currently known quantum error correction decoding schemes include a decoding scheme based on MWPM, a decoding scheme based on the RG algorithm, a decoding scheme based on CA, a decoding scheme based on MCMC (Markov Chain Monte Carlo), a decoding scheme based on MLD (Maximum Likelihood Decoding), a decoding scheme based on NN (Neural Network), and so on.
  • Figure 3 gives a rough comparison.
  • the black dots corresponding to MWPM are used to show the decoding performance and decoding time of the decoding scheme based on MWPM
  • the black dots corresponding to RG are used to show the decoding performance and decoding time of the decoding scheme based on the RG algorithm
  • the black dots corresponding to CA are used to show the decoding performance and decoding time of the decoding scheme based on the CA algorithm
  • the black dots corresponding to MCMC are used to show the decoding performance and decoding time of the decoding scheme based on the MCMC algorithm
  • the black dots corresponding to MLD are used to show the decoding performance and decoding time of the decoding scheme based on the MLD algorithm
  • the black dots corresponding to NNbD are used to show the decoding performance and decoding time of the decoding scheme based on the neural network.
  • the decoding scheme based on the neural network can achieve good decoding performance for small-scale surface codes, and the decoding time used is relatively short.
  • This application proposes an end-to-end machine learning decoding method based on multi-task learning.
  • This method greatly improves the decoding performance without increasing the algorithm complexity (computational complexity O(L^3), decoding time O(log L)) and while maintaining scalability, placing it in the area 30 shown by the dotted circle in the lower right corner of Figure 3 and achieving the best combination of decoding time and decoding performance.
  • its structure is also simpler, going from O(L^2) models to O(1), and there is no need for communication between models, which also makes hardware deployment easier in engineering.
  • Figure 4 shows a schematic diagram of an application scenario of a solution provided by an embodiment of the present application.
  • the application scenario can be a superconducting quantum computing platform, and the application scenario includes: a quantum circuit 41, a dilution refrigerator 42, a control device 43 and a computer 44.
  • the quantum circuit 41 is a circuit that acts on a physical quantum bit, and the quantum circuit 41 can be realized as a quantum chip, such as a superconducting quantum chip near absolute zero.
  • the dilution refrigerator 42 is used to provide an environment near absolute zero for the superconducting quantum chip.
  • the control device 43 is used to control the quantum circuit 41, and the computer 44 is used to control the control device 43.
  • the written quantum program is compiled into instructions by the software in the computer 44 and sent to the control device 43 (such as an electronic/microwave control system).
  • the control device 43 converts the above instructions into electronic/microwave control signals and inputs them into the dilution refrigerator 42 to control the superconducting quantum bit at 10mK.
  • the reading process is the opposite.
  • the quantum error correction decoding method based on neural network needs to be combined with the control device 43 (such as integrating the decoding algorithm into the electronic/microwave control system).
  • the control device 43 includes a master control system 43a (such as a central-board FPGA) and an error correction module 43b.
  • the master control system 43a issues an error correction instruction to the error correction module 43b of the control device 43.
  • the error correction instruction includes the error symptom information of the above-mentioned quantum circuit 41.
  • the error correction module 43b can be an FPGA or ASIC chip.
  • the error correction module 43b runs the quantum error correction decoding algorithm based on neural network, decodes the error symptom information, and converts the decoded error result information into an error correction control signal in real time and sends it to the quantum circuit 41 for error correction.
  • S(P) is the set of stabilizer generators that anticommute with P (also called the symptoms of the Pauli operator P); S(P) can be regarded as a bit array composed of 0s and 1s.
  • T(S(P)) is in one-to-one correspondence with S(P), and is called the simple representation of P.
  • a geometrically meaningful definition of the simple representation can be given: it is the shortest Pauli operator connecting the symptoms to the boundary whose syndrome equals S(P).
  • Figure 6 shows a schematic diagram of the simple representations corresponding to symptom points in the rotated surface code.
  • the dot a represents a single symptom point with a value of 1
  • the straight line 61 represents an X-type Pauli operator
  • this X-type Pauli operator is the shortest Pauli operator connecting symptom point a to the boundary
  • the dot b represents a single symptom point with a value of 1
  • the straight line 62 represents a Z-type Pauli operator, the shortest Pauli operator connecting symptom point b to the boundary
  • General topological error correction codes can have similar simple representation mappings.
  • L_c(P) is a fixed representative element of the logical class to which L(P) belongs
  • P_c is called the canonical representation of P
  • P_c = L_c(P)T(S(P)) is called the canonical decomposition of P. All Pauli operators are converted to their equivalent canonical representations. This greatly limits the unnecessary diversity of Pauli operators; in particular, when Pauli operators are selected as the output of the model, it greatly reduces the difficulty of model training and thus improves the convergence speed of the training process.
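Restating the definitions above as formulas (an editorial reconstruction consistent with the surrounding text, not copied from the original):

```latex
% Any Pauli operator P factors, up to a stabilizer element G, as
P = \ell \cdot T(S(P)) \cdot G, \qquad \ell \in L(P), \; G \in \mathcal{S},
% and fixing the logical representative \ell = L_c(P) gives the
% canonical representation / canonical decomposition of P:
P_c = L_c(P) \, T(S(P)).
```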
  • the symptom measurement circuit can be shown in FIG7 .
  • Part (a) of FIG7 shows an eigenvalue measurement circuit for the stabilizer generators that detect Z errors
  • part (b) of FIG7 shows an eigenvalue measurement circuit for the stabilizer generators that detect X errors.
  • the order of action of the controlled-NOT (CNOT) gates is very important and cannot be changed, otherwise conflicts arise from different quantum gates acting on the same quantum bit at the same time.
  • all steps, including the controlled-NOT gates, auxiliary state preparation and the final auxiliary state measurement, introduce noise.
  • the controlled-NOT gates also propagate errors
  • the two types of symptom measurements, X and Z, are interleaved with each other.
  • the ordering shown in FIG7 minimizes the propagation of errors, making their impact on the error correction capability negligible. Other orderings would greatly reduce the error correction capability.
  • Figure 8 shows a schematic diagram of a three-dimensional symptom distribution, with time in the vertical direction. It can be regarded as a three-dimensional data array composed of 0 and 1.
  • a total of 4 slices 81 are included, and each slice 81 represents the error symptom information obtained by a measurement.
  • Line 82 represents the symptoms caused by the Z error
  • line 83 represents the symptoms caused by the X error
  • line 84 represents the measurement error.
  • S is a 3-dimensional data array consisting of 0 and 1, representing error symptom information.
  • the most likely error on the two-dimensional array of data qubits can be inferred.
  • corresponding operations are performed to correct the physical errors that occurred on it.
  • the inferred error E_p does not need to coincide with the actual error left on the physical qubits; as long as the weight of the difference between the two is small enough, the residual error can be corrected in the next round of error correction.
  • the classification results corresponding to each data qubit include the following four cases: I, X, Y, Z, where I means no error, X means X error, Z means Z error, and Y means both X error and Z error.
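A minimal sketch of how these four classes combine per data qubit, assuming the decoder produces separate X and Z indicator bits (the function name and interface are hypothetical):

```python
# Map a per-qubit (X error?, Z error?) pair to the classes I, X, Z, Y.
def error_class(x_bit: int, z_bit: int) -> str:
    return {(0, 0): "I", (1, 0): "X", (0, 1): "Z", (1, 1): "Y"}[(x_bit, z_bit)]

assert error_class(0, 0) == "I"   # no error
assert error_class(1, 1) == "Y"   # simultaneous X and Z error, i.e. a Y error
```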
  • the present application provides two error decomposition methods.
  • the first is to decompose the canonical representation. Owing to the one-to-one correspondence between simple errors and symptoms, the canonical representation can be described by two collections of symptom bits, S_X and S_Z, together with the fixed representative element L_c. S_X and S_Z are called the canonical symptoms of the decoded output. Taking S_X as an example, it can be decomposed into m_1 parts S_X^(1), ..., S_X^(m_1).
  • the second method is to directly decompose the distribution of Pauli errors on the physical quantum bits.
  • the physical quantum bits can be divided into a disjoint union of blocks R_1, R_2, ..., R_m.
  • Pr(E_i|S) is the marginal probability distribution of E_i, obtained by marginalizing over the part of the operator E that acts outside block i, where i is an integer in the range [1, m]. Similar to the case of the canonical-representation decomposition, the error can be further decomposed into components E_1, E_2, ..., E_m, and approximate MAP (maximum a posteriori) decoding is applied to each component separately.
  • the selected partition method needs to satisfy a consistency condition on the blocks.
  • MWPM only uses X(Z)-type symptoms to decode Z(X)-type errors. Here, in contrast, all the symptom bits are used to decode both X and Z errors. This is because Z and X errors are correlated: their corresponding symptom bits are not completely independent. Considering all the symptoms together determines the locations of X and Z errors more accurately, which MWPM and other algorithms do not yet exploit.
  • the purpose of using machine learning methods is simply to approximate Pr(E_i|S) as accurately as possible.
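The block-wise approximate MAP rule described above can be written as follows; this is a reconstruction from the surrounding text, since the original formulas did not survive extraction:

```latex
\Pr(E_i \mid S) = \sum_{E_{\bar{i}}} \Pr\left(E_i E_{\bar{i}} \mid S\right),
\qquad
\hat{E}_i = \arg\max_{E_i} \Pr(E_i \mid S),
\qquad
\hat{E} = \hat{E}_1 \hat{E}_2 \cdots \hat{E}_m ,
```

where E_ī denotes the part of E acting outside block i; the neural network's task is to approximate each Pr(E_i|S).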
  • the entire decoding process is shown in Figure 9.
  • the error symptom information S is decoded by m (m is greater than 1) neural network models respectively to obtain m groups of probability distributions, and then the error result information is determined based on the m groups of probability distributions.
  • the error result information indicates the data quantum bit where the error occurred and the corresponding error type.
  • in the perfect-symptom case, this decoding method reduces to a 4-class classification problem, so only one network is needed to complete the decoding. From a topological point of view, in the fault-tolerant scenario with measurement noise, the decoding problem cannot be reduced to such a simple classification problem, so fault-tolerant decoding using neural networks is much more complicated than the perfect-symptom case.
  • FIG. 10 shows a flowchart of a quantum error correction decoding method based on a neural network provided by an embodiment of the present application.
  • the method can be applied to the control device of the application scenario shown in FIG. 4 .
  • the method can include at least one of the following steps 1010 to 1040:
  • Step 1010 obtaining error symptom information obtained by performing symptom measurement on the quantum circuit.
  • the error symptom information is a data array composed of the eigenvalues of the stabilizer generators of the quantum error correction code.
  • optionally, the error symptom information is a two-dimensional or three-dimensional data array consisting of 0s and 1s. For example, when there is no error, the eigenvalue of a stabilizer generator is 0; when an error occurs, the eigenvalues of the affected stabilizer generators become 1.
  • taking the quantum error correction code to be a surface code as an example, errors and error symptoms have specific spatial locations: when an error causes a symptom, the eigenvalue of the auxiliary quantum bit at the corresponding position is 1 (it can be regarded as a point particle appearing at that position), and when there is no error, the eigenvalue is 0. Therefore, for the surface code, if the errors of the error correction process itself are not considered (that is, if the measurement process is perfect, called a perfect symptom), the error symptom information can be regarded as a two-dimensional data array composed of 0s and 1s.
  • each round of symptom measurement can obtain error symptom information in the form of a two-dimensional data array, and multiple rounds of symptom measurements can obtain error symptom information in the form of a three-dimensional data array, as shown in FIG8 .
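As a toy illustration of this data layout, the sketch below stacks several rounds of 2D symptom measurements into a 3D array; the array sizes are illustrative assumptions, not the exact dimensions of a surface code symptom array.

```python
import numpy as np

L, rounds = 7, 4
# Each round yields a 2D array of 0/1 symptom bits; stacking successive
# rounds along the time axis yields the 3D array fed to the decoder.
slices = [np.random.randint(0, 2, (L, L)) for _ in range(rounds)]  # toy data
syndrome_3d = np.stack(slices, axis=0)
print(syndrome_3d.shape)  # (4, 7, 7): (time, height, width)
```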
  • Step 1020 extract feature information from the error symptom information using a neural network decoder.
  • the control device when using a neural network decoder to extract feature information from error symptom information, can extract features from the error symptom information through the feature extraction network of the neural network decoder to obtain feature information; wherein the neural network decoder includes a feature extraction network and n feature decoding networks, and n is an integer greater than 1.
  • the neural network decoder is a machine learning model based on a neural network for decoding error symptom information.
  • the input data of the neural network decoder is the error symptom information
  • the output data is the error result information corresponding to the error symptom information.
  • the neural network decoder includes a feature extraction network and multiple feature decoding networks.
  • the feature extraction network is used to extract features from error symptom information to obtain feature information.
  • the feature information output by the feature extraction network will be input into multiple feature decoding networks respectively, and the multiple feature decoding networks will decode the feature information respectively to obtain decoding results corresponding to each feature decoding network.
  • the feature extraction network can be constructed based on CNN.
  • the feature decoding network can be based on FCN (Fully Connected Neural Network).
  • the front end is a feature extraction network, which may include multiple cascaded feature extraction subnetworks and a feature fusion subnetwork.
  • the function of the feature extraction subnetwork is to extract local feature information using a divide and conquer approach, and the feature fusion subnetwork finally aggregates all local feature information and compresses it to obtain feature information.
  • the feature extraction subnetwork can be constructed based on CNN, such as each feature extraction subnetwork includes one or more convolutional layers.
  • the feature fusion subnetwork can be constructed based on a fully connected network, such as including one or two fully connected layers.
  • each feature decoding network outputs one component of the noise decomposition, for example Pr(E_i|S).
  • the sizes of the front-end and back-end networks depend on the specific situation. According to current experimental conclusions, the size of each back-end feature decoding network (such as one using a fully connected layer) can be independent of the error correction code scale L; the total number of model parameters of the back end is proportional to O(L^2), and the computation depth is O(1).
  • the overall computational complexity of the back end is O(L^2). The front-end computational complexity and the complexity analysis of the overall algorithm are explained below.
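To make the structure concrete, below is a minimal PyTorch-style sketch of a decoder with a convolutional feature-extraction front end, a feature-fusion layer, and n fully connected feature-decoding heads. All channel counts, kernel sizes, pooling choices, and head sizes are illustrative assumptions, not parameters taken from this disclosure.

```python
import torch
import torch.nn as nn

class NeuralDecoderSketch(nn.Module):
    """Shared feature-extraction front end + n feature-decoding heads."""

    def __init__(self, n_heads: int = 4, feat_dim: int = 128, n_classes: int = 4):
        super().__init__()
        # Front end: cascaded convolutional feature-extraction subnetworks
        # acting on the 3D syndrome array (time x height x width).
        self.extract = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, padding=1), nn.LeakyReLU(),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.LeakyReLU(),
            nn.AdaptiveAvgPool3d(2), nn.Flatten(),
        )
        # Feature-fusion subnetwork: aggregates and compresses local features.
        self.fuse = nn.Sequential(nn.Linear(64 * 8, feat_dim), nn.LeakyReLU())
        # Back end: n feature-decoding networks sharing the same feature input.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, n_classes) for _ in range(n_heads)
        )

    def forward(self, syndrome: torch.Tensor) -> list:
        # syndrome: (batch, 1, T, H, W) array of 0/1 symptom bits
        feat = self.fuse(self.extract(syndrome))
        return [head(feat) for head in self.heads]  # one output per task

# Toy usage: a batch of two 4-round, 7x7 symptom arrays.
out = NeuralDecoderSketch()(torch.randint(0, 2, (2, 1, 4, 7, 7)).float())
print([o.shape for o in out])  # four heads, each producing (2, 4) logits
```

Each head can then be trained against its own target (a canonical symptom block, a fixed representative element, or a per-block Pauli distribution) with the multi-task loss sketched earlier.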
  • the second error decomposition method can provide end-to-end training, and use X-type and Z-type symptoms for decoding at the same time, providing a significant improvement in decoding performance.
  • a neural network decoder that is trained and inferred using the first error decomposition method is called a first-type decoder; a neural network decoder that is trained and inferred using the second error decomposition method is called a second-type decoder.
  • the feature extraction network of the neural network decoder uses the divide-and-conquer idea and adopts a block feature extraction method when extracting features from the error symptom information. That is, each feature extraction subnetwork (or some of them) performs block feature extraction on its input data.
  • block feature extraction means that when extracting feature information, the feature extraction subnetwork partitions the input data into multiple small blocks and performs feature extraction on each block; that is, after the input data is divided into at least two blocks, at least two feature extraction units perform parallel feature extraction on the at least two blocks.
  • the at least two blocks correspond one-to-one with the at least two feature extraction units: each feature extraction unit extracts features from one block, and the number of blocks equals the number of feature extraction units.
  • feature extraction on the above at least two blocks is performed in parallel, that is, simultaneously, which helps reduce the time required for feature extraction.
  • the error symptom information is a three-dimensional array of symptom bits. After the three-dimensional symptom bits are divided into C_1 blocks, the first feature extraction subnetwork performs parallel feature extraction on the C_1 blocks to obtain C_2 blocks. Similarly, the second feature extraction subnetwork performs parallel feature extraction on the C_2 blocks to obtain C_3 blocks.
  • in general, the k-th feature extraction subnetwork performs parallel feature extraction on C_k blocks to obtain C_{k+1} blocks, where k is a positive integer.
  • the feature fusion subnetwork fuses and compresses the above C_{k+1} blocks to obtain the feature information used as the input of the back end.
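A sketch of block feature extraction, assuming a toy 4x4x4 symptom array split into 2x2x2 blocks with one small unit per block (on hardware the per-block units would run truly in parallel; the Python loop below is only sequential for illustration):

```python
import torch
import torch.nn as nn

# Split a toy 3D symptom array into eight non-overlapping 2x2x2 blocks.
x = torch.randint(0, 2, (1, 4, 4, 4)).float()               # (batch, T, H, W)
blocks = x.unfold(1, 2, 2).unfold(2, 2, 2).unfold(3, 2, 2)  # block view
blocks = blocks.reshape(1, 8, 8)                            # 8 blocks x 8 bits
# One feature extraction unit per block (conceptually parallel).
units = nn.ModuleList(nn.Linear(8, 4) for _ in range(8))
feats = torch.stack([u(blocks[:, i]) for i, u in enumerate(units)], dim=1)
print(feats.shape)  # (1, 8, 4): one 4-dimensional feature per block
```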
  • Step 1030 decode the feature information through the neural network decoder to obtain a decoding result.
  • the control device can decode the feature information through the n feature decoding networks respectively to obtain decoding results corresponding to the n feature decoding networks respectively; wherein the n feature decoding networks are trained using a multi-task learning method to have the ability to generate different decoding results.
  • step 1030 may include:
  • the feature information is decoded by n_1 feature decoding networks respectively, and decoding results corresponding to the n_1 feature decoding networks are obtained; for the i-th feature decoding network among the n_1 feature decoding networks, the corresponding decoding result includes the i-th canonical symptom related to the target error type, where i is a positive integer not greater than n_1; canonical symptoms are the canonical decomposition results of the error symptom information;
  • the feature information is decoded by n_2 feature decoding networks respectively to obtain decoding results corresponding to the n_2 feature decoding networks; for the j-th feature decoding network among the n_2 feature decoding networks, the corresponding decoding result includes a fixed representative element related to the target error type, where j is a positive integer not greater than n_2; the sum of n_1 and n_2 equals n, and n_1 and n_2 are both positive integers.
  • the target error types include Pauli X errors and Pauli Z errors, n_1 equals the sum of m_1 and m_2, m_1 and m_2 are both positive integers, and n_2 equals 2;
  • m_1 of the n_1 feature decoding networks are used to decode the feature information to obtain m_1 canonical symptoms related to the Pauli X error;
  • m_2 of the n_1 feature decoding networks are used to decode the feature information to obtain m_2 canonical symptoms related to the Pauli Z error;
  • one of the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli X error;
  • the other of the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli Z error.
  • the values of m_1 and m_2 may be the same or different.
  • the error symptom information includes Z-type symptom information and X-type symptom information, and the Z-type symptom information is decoded to obtain X-type error result information, which indicates the quantum bit that has a Pauli X error in the quantum circuit; the X-type symptom information is decoded to obtain Z-type error result information, which indicates the quantum bit that has a Pauli Z error in the quantum circuit.
  • the canonical symptoms associated with the Pauli X error can be expressed as S_X^(1), ..., S_X^(m_1); each S_X^(i) contains a portion of the Z-type symptom bits used in decoding to determine the X-type error.
  • the canonical symptoms associated with the Pauli Z error can be expressed as S_Z^(1), ..., S_Z^(m_2); each S_Z^(j) contains a portion of the X-type symptom bits used in decoding to determine the Z-type error.
  • the fixed representative element related to the Pauli X error can be expressed as L_c^X.
  • the fixed representative element related to the Pauli Z error can be expressed as L_c^Z.
  • the quantum bits contained in the quantum circuit are divided into n blocks, each block containing at least one quantum bit.
  • the decoding result corresponding to the kth feature decoding network includes: the Pauli operator acting on the quantum bits contained in the kth block among the n blocks, k is a positive integer less than or equal to n.
  • the k-th block can be recorded as R_k
  • the Pauli operator acting on the quantum bits contained in the k-th block R_k can be expressed as E_k.
  • the n feature decoding networks can obtain the Pauli operators E_1, E_2, ..., E_n acting on the n blocks respectively.
  • Step 1040 Determine error result information of the quantum circuit according to the decoding result.
  • the error result information indicates a quantum bit in which an error occurs in the quantum circuit.
  • the above error result information may also indicate the error type corresponding to the qubit where an error occurs in the quantum circuit.
  • the control device can determine the error result information based on the decoding results corresponding to the n feature decoding networks respectively.
  • the quantum bits where errors occur in the quantum circuit and the corresponding error types can be determined. For example, the location of the data quantum bit where the error occurs in the quantum circuit and the error type of the data quantum bit where the error occurs at that location can be determined, such as whether it is a Pauli X error, a Pauli Z error, or both a Pauli X error and a Pauli Z error (i.e., a Pauli Y error).
  • step 1040 may include:
  • according to the canonical symptoms S_X^(1), ..., S_X^(m_1) and the fixed representative element L_c^X, the X-type error result information is determined; the X-type error result information indicates the quantum bits in the quantum circuit where Pauli X errors occur;
  • according to the canonical symptoms S_Z^(1), ..., S_Z^(m_2) and the fixed representative element L_c^Z, the Z-type error result information is determined; the Z-type error result information indicates the quantum bits in the quantum circuit where Pauli Z errors occur.
  • two neural network decoders can be trained, which are recorded as a first neural network decoder and a second neural network decoder.
  • the feature extraction network of the first neural network decoder extracts features from the Z-type error symptom information to obtain the first feature information, and then the m_1+1 feature decoding networks of the first neural network decoder respectively decode the first feature information to obtain the decoding results corresponding to the m_1+1 feature decoding networks, wherein the decoding results corresponding to m_1 of the feature decoding networks include the m_1 canonical symptoms related to Pauli X errors, and the decoding result corresponding to the remaining feature decoding network includes the fixed representative element related to Pauli X errors.
  • the X type error result information is determined, and the X type error result information indicates the quantum bit in the quantum circuit where the Pauli X error occurs.
  • the feature extraction network of the second neural network decoder extracts features from the X-type error symptom information to obtain the second feature information, and then the m_2+1 feature decoding networks of the second neural network decoder respectively decode the second feature information to obtain the decoding results corresponding to the m_2+1 feature decoding networks, wherein the decoding results corresponding to m_2 of the feature decoding networks include the m_2 canonical symptoms related to Pauli Z errors, and the decoding result corresponding to the remaining feature decoding network includes the fixed representative element related to Pauli Z errors.
  • the Z-type error result information is determined, and the Z-type error result information indicates the quantum bit in the quantum circuit where the Pauli Z error occurs.
  • the first neural network decoder and the second neural network decoder can both be trained using the multi-task learning method described above, and the number of feature decoding networks included in the first neural network decoder and the second neural network decoder can be the same or different, and this application does not limit this.
  • the error symptom information is decomposed into two parts, namely, the Z-type error symptom information and the X-type error symptom information, and the two parts are respectively input into the first neural network decoder and the second neural network decoder, which helps to reduce the computational complexity of the neural network decoder.
  • the error symptom information may not be decomposed, and the error symptom information may be directly input into the first neural network decoder and the second neural network decoder, and the first neural network decoder obtains the X-type error result information according to the error symptom information, and the second neural network decoder obtains the Z-type error result information according to the error symptom information.
  • step 1040 may include: determining the error result information according to the Pauli operators E_1, E_2, ..., E_n acting on the n blocks respectively. For the specific principle of this determination, please refer to the description in the above embodiments.
  • the error result information indicates the quantum bits in the quantum circuit that have Pauli X errors and the quantum bits that have Pauli Z errors. That is, the error symptom information does not distinguish between the X-type error symptom information and the Z-type error symptom information, but combines and uses all the symptom bits to decode the X and Z errors at the same time. Correspondingly, the decoding result does not distinguish between the X-type error result information and the Z-type error result information, but directly decodes to obtain the error result information containing the X-type error and the Z-type error.
  • two neural network decoders may also be used for the second type of decoder, which are denoted as the first neural network decoder and the second neural network decoder.
  • the inputs of the first neural network decoder and the second neural network decoder are both error symptom information, and do not distinguish between type X error symptom information and type Z error symptom information, but are combined to use all symptom bits to decode X and Z errors at the same time.
  • the feature extraction network of the first neural network decoder extracts features from the error symptom information to obtain first feature information; the first feature information is then decoded by the n feature decoding networks of the first neural network decoder to obtain first decoding results corresponding to the n feature decoding networks, wherein the first decoding result corresponding to the k-th feature decoding network includes the Pauli operator related to X-type errors acting on the k-th block, and the X-type error result information is determined according to the first decoding results corresponding to the n feature decoding networks.
  • the X-type error result information indicates the quantum bits in the quantum circuit where Pauli X errors occur.
  • the feature extraction network of the second neural network decoder extracts features from the error symptom information to obtain second feature information; the second feature information is then decoded by the n feature decoding networks of the second neural network decoder to obtain second decoding results corresponding to the n feature decoding networks, wherein the second decoding result corresponding to the k-th feature decoding network includes the Pauli operator related to Z-type errors acting on the k-th block, and the Z-type error result information is determined according to the second decoding results; the Z-type error result information indicates the quantum bits in the quantum circuit where Pauli Z errors occur.
  • the first neural network decoder and the second neural network decoder can be trained by the multi-task learning method described above, and the number of feature decoding networks included in the first neural network decoder and the second neural network decoder can be the same or different, and this application does not limit this.
  • the output type of the neural network decoder is not limited.
  • a physical level output is used, and the model of the physical level output directly generates the specific quantum bit information where the error occurred, that is, which specific quantum bit has what type of error.
  • a logical level output is used, and the model of the logical level output outputs a logical error class after a specific error has been mapped, and then the equivalent error that occurred specifically on the quantum bit can be inferred based on this logical error class (this inferred error may not be the same as the original error, but the effect is the same, which is the error degeneracy phenomenon unique to quantum error correction codes).
  • the neural network decoder can use logical level output.
  • in some embodiments, decoding is performed in parallel with the measurement and collection of new error symptom information.
  • by parallelizing symptom measurement and decoding, it is not necessary to wait until all O(L) rounds of symptom measurement are completed before starting decoding.
  • as soon as part of the symptom data is available, the corresponding calculation can be started.
  • decoding can thus proceed during the subsequent symptom measurements, the two running in parallel. This reduces the overall delay from the end of the last round of symptom measurement to the completion of error correction; to prevent errors from accumulating during the error correction process, the shorter this delay, the better.
  • the neural network decoder includes a feature extraction network and n feature decoding networks, which are deployed on the same chip.
  • the chip can be an FPGA or an ASIC.
  • the two neural network decoders can be deployed on the same chip or on two chips, which is not limited in this application.
  • the technical solution provided by the embodiments of the present application is an error correction decoding solution based on a multi-task learning neural network model: a neural network decoder extracts corresponding feature information from the input error symptom information, then decodes the feature information and outputs decoding results describing the local information distribution of the noise decomposition, and the error result information is determined according to the decoding results.
  • the solution of the present application requires only a single neural network decoder to accurately determine the error result information, thereby fully improving the decoding performance and shortening the decoding time without increasing the algorithm complexity, while maintaining scalability; it is also easier to deploy in hardware in engineering. For example, the above embodiment designs the neural network decoder as a structure including a feature extraction network and multiple feature decoding networks: the feature extraction network extracts corresponding feature information from the input error symptom information, the feature information is fed simultaneously to the multiple feature decoding networks, the multiple feature decoding networks output the local information distribution of the noise decomposition, and the error result information is then determined according to the decoding results of the multiple feature decoding networks.
  • in this way, the decoding performance is fully improved and the decoding time is shortened without increasing the algorithm complexity, while maintaining scalability, and hardware deployment is also easier in engineering.
  • end-to-end inference and training can be provided by decomposing the Pauli errors directly according to their distribution over the physical quantum bits.
  • since X and Z errors are interrelated, using both X-type and Z-type symptoms for decoding and considering all symptom bits jointly allows the positions of X and Z errors to be determined more accurately, providing a significant improvement in decoding performance.
  • the present application proposes to use LFEM (Local Feature Extraction Mapping) to compress the computational complexity.
  • the large-scale error correction code is regarded as multiple small-scale error correction codes (which may be called "small error correction codes"); after a "small error correction code" is locally "decoded", the obtained information is aggregated at the next higher level for "decoding". This process can recurse until the final decoded information is the error that needs to be corrected.
  • the decoding of the "small error correction code" in a specific region at each layer can be called an LFEM.
  • the feature extraction network includes multiple cascaded feature extraction sub-networks; wherein the input data of the first feature extraction sub-network includes the error symptom information, the input data of the s-th feature extraction sub-network includes the output data of the (s-1)-th feature extraction sub-network, the output data of the last feature extraction sub-network includes the feature information, and s is an integer greater than 1.
  • the input data of the target feature extraction subnetwork is divided into a plurality of input data blocks of the same scale.
  • the target feature extraction subnetwork may be any one of the plurality of cascaded feature extraction subnetworks.
  • the target feature extraction subnetwork is used to perform multiple local feature extraction mappings on a plurality of input data blocks to obtain a plurality of groups of mapping output data.
  • Each local feature extraction mapping is used to perform mapping processing on the regions at the same position in a plurality of input data blocks to obtain a group of mapping output data; different local feature extraction mappings are used to perform mapping processing on the regions at different positions in a plurality of input data blocks to obtain a plurality of groups of mapping output data.
  • the target feature extraction subnetwork is also used to obtain the output data of the target feature extraction subnetwork based on the plurality of groups of mapping output data.
  • FIG. 15 exemplarily shows a schematic diagram of a feature extraction subnetwork performing local feature extraction mapping.
  • FIG. 15 shows the process of performing LFEM on regions at two different positions (distinguished by the labels 1 and 2 in the figure): a single LFEM acts on the region at the same position in each of the C_i input data blocks, while different LFEM applications act on regions at different positions in the C_i input data blocks, as shown by the two regions labeled 1 and 2 in the figure.
  • the so-called overlap between regions at different locations refers to the existence of overlapping data between regions at different locations.
  • to compress the computational complexity, the overlap between the "small error correction codes" of each layer is kept small in all three dimensions; in this way, the number of layers of local "error correction" grows as O(log L).
  • setting the regions at different positions to overlap helps improve the decoding effect, because the same portion of information, after being acted on by two LFEMs, can be cross-validated to improve decoding performance.
  • however, the overlap should not be excessive, lest it generate additional computational complexity. It should be noted that the network parameters involved in different LFEMs are identical; only the regions of the input they act on differ.
  • the target feature extraction subnetwork includes: at least one convolutional layer and at least one fully connected layer.
  • the at least one convolutional layer is used to perform multiple local feature extraction mappings on multiple input data blocks to obtain multiple sets of mapping output data.
  • the at least one fully connected layer is used to obtain output data of the target feature extraction subnetwork based on the multiple sets of mapping output data.
  • the simplest way is to use a single-layer 3D CNN, but the expressive power of such a neural network is limited, and decoding performance suffers considerably when the error correction code scale is large. Therefore, after the 3D CNN kernels act on the same region of all C_i input three-dimensional information blocks (the dashed boxes with the same label in Figure 15), they are connected to an FFN for further information integration and compression.
  • this FFN can contain a varying number of fully connected layers; in practice, the maximum number of layers can be limited to 2. A sketch of such a subnetwork follows.
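  • A minimal sketch (Python, assuming PyTorch is available) of one feature extraction subnetwork as described above: a 3D convolution applies the same local feature extraction mapping at every region of the C_i input blocks (stride smaller than the kernel makes neighboring regions overlap slightly), and a small feed-forward network of at most two layers (realized here as 1x1x1 convolutions, i.e., per-position fully connected layers) integrates and compresses the features into C_{i+1} output blocks. All sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class FeatureExtractionSubnetwork(nn.Module):
        def __init__(self, c_in: int, c_out: int, kernel: int = 3, hidden: int = 32):
            super().__init__()
            # One LFEM: identical kernel weights act on same-position regions of
            # all C_i input blocks; stride 2 < kernel 3 gives small overlaps.
            self.lfem = nn.Conv3d(c_in, hidden, kernel_size=kernel, stride=2, padding=1)
            self.act = nn.LeakyReLU(negative_slope=2 ** -4)  # shift-friendly slope, an assumption
            self.ffn = nn.Sequential(                        # at most two FC layers per position
                nn.Conv3d(hidden, hidden, kernel_size=1),
                nn.LeakyReLU(negative_slope=2 ** -4),
                nn.Conv3d(hidden, c_out, kernel_size=1),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, C_i, depth, height, width) -- the C_i input data blocks.
            return self.ffn(self.act(self.lfem(x)))

    # Example: 8 input blocks of 16x16x16 compressed into 16 blocks of 8x8x8.
    out = FeatureExtractionSubnetwork(8, 16)(torch.randn(1, 8, 16, 16, 16))
    print(out.shape)  # torch.Size([1, 16, 8, 8, 8])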
  • the number of parameters of each feature extraction subnetwork is determined by the structure of its LFEM: the 3D CNN of layer i has K_i·C_i·M^3 parameters, where M = max{k_i} is the side length of the largest convolution kernel.
  • the FFN of layer i has C_{i+1}·K_i parameters. If C_i, C_{i+1}, K_i ~ O(1), the total number of front-end parameters is Σ_{i=1}^{O(log L)} (K_i C_i M^3 + C_{i+1} K_i) ~ O(log L).
  • the number of parameters of the back end is O(L^2), so the total number of parameters is O(L^2).
  • the constant hidden under the O is usually large, so when L is small the front end may actually occupy more parameters than the back end. Both this asymptotic growth and the actual parameter counts of real models obtained in testing are acceptable in practical engineering.
  • the total computation of the back end can be fully parallelized, and the O(L^2) feature decoding networks of the back end have size independent of L, O(1) layers, O(1) multiplication time, and O(log C) ~ O(1) accumulation time. Therefore, the depth of the entire algorithm is O(log L).
  • this algorithm time is the shortest computation time theoretically achievable when computing resources are sufficient.
  • LeakyReLU is defined as f(x) = x for x >= 0 and f(x) = a·x for x < 0, where a < 0 is chosen with |a| = 2^(-n) for some n in N+, and N+ represents the set of positive integers.
  • the advantage of this is that the calculation of the excitation layer only needs to determine the sign bit and right-shift a finite number of bits, which greatly simplifies the computational implementation. Simulations show that LeakyReLU is sufficient to provide quite good results. A shift-based sketch follows.
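  • A minimal sketch of the shift-based excitation described above, assuming fixed-point integer data and slope magnitude 2^-n (n = 4 here, an illustrative choice): the computation needs only the sign bit and an arithmetic right shift, no multiplier. The standard positive-slope convention is used for the demonstration.

    def leaky_relu_shift(x: int, n: int = 4) -> int:
        """f(x) = x for x >= 0, else x * 2^-n, via an arithmetic right shift."""
        return x if x >= 0 else x >> n

    print(leaky_relu_shift(96))    # 96  (non-negative inputs pass through)
    print(leaky_relu_shift(-96))   # -6  (-96 / 16)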
  • the training process of the neural network decoder is as follows:
  • after the model network structure is set, the model needs to be trained. Since multiple distribution functions need to be learned at the output end, the outputs of the n feature decoding networks are each trained using the cross entropy loss function.
  • when generating input-output training data, the input end is specified as randomly generated symptoms (a single X or Z symptom, or both types of symptoms together), and the output end is the output corresponding to that symptom, one-hot encoded. Note that the same input symptom may correspond to different outputs; this diversity at the output end is what allows the model to eventually learn, during training, the output probability distribution conditioned on a specific input symptom.
  • the loss function value corresponding to the i-th feature decoding network is determined according to the predicted decoding result corresponding to the i-th feature decoding network and the label decoding result corresponding to the i-th feature decoding network determined based on the sample error result information.
  • the loss function value corresponding to the i-th feature decoding network is used to measure the similarity between the predicted decoding result corresponding to the i-th feature decoding network and the label decoding result corresponding to the i-th feature decoding network.
  • the label decoding result corresponding to the i-th feature decoding network refers to the preset output label of the i-th feature decoding network, which can also be understood as the expected output result.
  • the goal of training the i-th feature decoding network is to make its corresponding predicted decoding result as close as possible to the label decoding result.
  • the above method can be used to determine the loss function value corresponding to the feature decoding network.
  • the loss function values corresponding to the n feature decoding networks can be weighted and summed to obtain a total loss function value.
  • the total loss function value is used to characterize the performance of the entire neural network decoder.
  • the weight values corresponding to each feature decoding network can be the same or different, and this application does not limit this.
  • the gradient descent method can be used to minimize the total loss function value, calculate the parameter adjustment gradient of the neural network decoder, and adjust the parameters of the neural network decoder to be trained according to the parameter adjustment gradient to obtain the trained neural network decoder.
  • the parameters of the neural network decoder include the weight parameters of each neural network contained in the neural network decoder.
  • a third-party decoder is required to generate one-to-one corresponding input and output.
  • a natural choice is to use the MWPM decoder to generate a single output from the simulation-generated symptoms. Disregarding computational cost, a better and more complex known decoder could also be used. In this case, the two types of symptoms can be separated and used respectively for decoding the two types of errors (because this is the input-output pattern generated by MWPM), compressing the overall computational complexity at the cost of only a small loss in decoding performance.
  • the canonical representation of the original error data generated by simulation is directly used as the multi-task learning output label, and X and Z symptoms are used as model input at the same time.
  • This allows the same input symptom in the training set to correspond to multiple output errors. This is because different results obtained by reasoning based on the physical bit distribution will at most cause local residual physical bit errors and will not constitute logical errors.
  • learning the physical error distribution rather than just maximum likelihood learning can achieve better decoding performance.
  • the classic Adam algorithm can be used for both decomposition methods, and the batch size can be greater than 1000.
  • letting the loss function corresponding to the i-th feature decoding network be loss_i (where i indexes the feature decoding networks and is a positive integer),
  • all loss functions can be added together to generate the total loss function: Loss = Σ_i α_i · loss_i,
  • where Loss is the total loss function of the neural network decoder, loss_i represents the loss function corresponding to the i-th feature decoding network, and α_i represents the weight value of the loss function corresponding to the i-th feature decoding network.
  • the Adam algorithm is used to perform gradient-descent multi-task joint learning on the loss function Loss. In practice, all α_i can be set to 1, or other values can be chosen; in each training epoch, the learning rate is gradually reduced, or first increased and then reduced, depending on the actual situation. A training-step sketch follows.
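  • A minimal sketch (Python, PyTorch assumed) of the multi-task training objective described above: n feature decoding heads share one extracted feature, each head is trained with cross entropy against its label decoding result, and the weighted sum of the n losses is minimized with Adam. The shapes, the number of heads, and α_i = 1 are illustrative assumptions.

    import torch
    import torch.nn as nn

    n_heads, n_classes, feat_dim = 6, 16, 128
    heads = nn.ModuleList(nn.Linear(feat_dim, n_classes) for _ in range(n_heads))
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(heads.parameters(), lr=1e-3)
    alphas = [1.0] * n_heads  # per-task weights alpha_i, all set to 1 here

    features = torch.randn(1024, feat_dim)                 # front-end output (batch > 1000)
    labels = torch.randint(0, n_classes, (n_heads, 1024))  # label decoding results

    total_loss = sum(a * criterion(h(features), y)         # Loss = sum_i alpha_i * loss_i
                     for a, h, y in zip(alphas, heads, labels))
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()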
  • the division of blocks is related to the correlation between errors, and quantum bits contained in the same block are prone to correlated errors.
  • the quantum bits contained in the same block mentioned above are prone to correlated errors, which means that the probability of correlated errors occurring between quantum bits contained in the same block is greater than the probability of correlated errors occurring between quantum bits contained in different blocks.
  • the probability of correlated Pauli X errors or Pauli Z errors occurring between quantum bits contained in the same block is greater than the probability of correlated Pauli X or Pauli Z errors occurring between quantum bits contained in different blocks.
  • the model can be trained using the original errors generated during the simulation rather than the indirect error data generated by other decoders.
  • the physical qubits can be split into a disjoint union of blocks: Q = R_1 ∪ R_2 ∪ … ∪ R_n, with R_i ∩ R_j = ∅ for i ≠ j.
  • each block R_i should cover as many of the possible local error correlations as possible.
  • the most typical error correlation is caused by the symptom measurement circuit.
  • the typical correlated errors generated by the symptom measurement circuit are two-body X and Z errors along the diagonals.
  • the dashed box 161 in Figure 16 represents a two-body X error,
  • and the dashed box 162 represents a two-body Z error. Therefore, when dividing the bit region R, these diagonal distributions need to be covered as much as possible for both types of errors.
  • each small white circle represents a physical qubit, and the physical qubits connected by a darkened black line belong to the same block; viewed from bottom to top, the block sizes |R_i| are {12, 12, 12, 12, 12, 12, 9} physical qubits.
  • the quantum bits contained in the same block are more likely to produce correlated errors, which helps to further improve the decoding performance. A toy encoding of such per-block labels is sketched below.
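  • A toy illustration of the block decomposition just described: the restriction of a sampled Pauli error to each disjoint block R_i is encoded as a single class index (4^|R_i| classes per block), the form a feature decoding network's label can take. The 2-qubit blocks and the sample error are illustrative assumptions; blocks in this application are larger (up to 12 qubits).

    PAULI = {"I": 0, "X": 1, "Z": 2, "Y": 3}

    def block_label(error: dict, block: list) -> int:
        """Encode the Pauli error restricted to `block` as one class index."""
        idx = 0
        for q in block:
            idx = idx * 4 + PAULI[error.get(q, "I")]  # base-4 digit per qubit
        return idx

    blocks = [[0, 1], [2, 3]]   # disjoint blocks R_1, R_2
    error = {1: "X", 2: "Y"}    # qubit 1 has an X error, qubit 2 a Y error
    print([block_label(error, b) for b in blocks])  # [1, 12]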
  • the quantum bits contained in the sample quantum circuit are divided using a plurality of different block division methods, and the neural network decoder is jointly trained based on the plurality of different block division methods.
  • the quantum bits contained in the quantum circuit are divided using one of the plurality of different block division methods.
  • the decoding performance will exhibit the so-called error floor phenomenon in the low-physical-error-rate regime; that is, at low physical error rates, the logical errors decrease only very slowly as the physical errors decrease, and the decoding benefit the error correction code should provide is lost.
  • the multi-task networks corresponding to all t partitions are placed at the back end of the decoder neural network and trained jointly during the training phase. Specifically, the total loss function needs to be redefined as Loss = Σ_{j=1}^{t} Σ_i α_i^(j) · loss_i^(j), summing the per-partition multi-task losses over all t partitions.
  • the chip deploying the neural network decoder may adopt a single-core architecture or a multi-core architecture.
  • the single-core architecture and multi-core architecture here refer to the number of processors (or processing cores, cores) included.
  • the chip includes a single processor, and all the execution steps of the neural network decoder described in the above embodiment are completed by the single processor.
  • the computational complexity of the decoding method provided by the present application is O(L^3).
  • when L is small, a single-core architecture can withstand this computational complexity; but when L is large, the single-core architecture becomes limiting. Therefore, the present application proposes a multi-core architecture solution.
  • the chip includes multiple processors in a tree structure.
  • the number of processors included in the chip is not limited, and can be specifically designed in combination with the size of L or the computational complexity to complete the entire decoding algorithm while fully utilizing the computing power of each processor.
  • any two processors that are not connected have parallelism, which can maximize the computing power of each processor and shorten the decoding time.
  • any two processors that are connected can be executed sequentially.
  • Figure 19 shows a schematic diagram of a multi-core architecture.
  • processors 1 to p may have no connection relationship among themselves, and these p processors can execute in parallel, for example using a divide-and-conquer strategy to process different blocks of the error symptom information in parallel, and/or using LFEM to perform local feature extraction mapping on different input data blocks.
  • the feature data extracted by processors 1 to p is sent to processor p+1, which processes the feature data provided by processors 1 to p to obtain the feature information.
  • the feature information is then input to processors p+2 to N respectively; processors p+2 to N may have no connection relationship among themselves, and these processors can execute in parallel, for example with each processor deploying one feature decoding network to decode the feature information and obtain the corresponding decoding result.
  • the error result information can be determined by a processor according to the decoding results corresponding to each feature decoding network.
  • the information to be processed can be sent to each processor at the same time so that the multiple processors can execute in parallel.
  • different blocks of error symptom information can be sent to each processor at the same time so that the p processors can execute in parallel.
  • feature information can be sent to each processor at the same time so that the multiple processors can execute in parallel.
  • a separate controller or processor can also be set to control the execution timing of each processor, such as controlling multiple parallel processors to start execution at the same time, and controlling serial processors to execute in sequence, so as to better coordinate the work of each processor and ensure the correctness and stability of the processing flow.
  • since both the feature extraction part and the feature decoding part of the multi-task-learning-based neural network decoder provided by the present application have inherent parallelism, the work can easily be distributed across multiple different processors. Moreover, the inputs of different processors are roughly independent, almost no processor-to-processor communication is required, and only a small number of processors need to communicate to transmit data. This method is parallelizable without bound: while making full use of each processor, the computation scale can always be expanded by adding more processors to maintain the O(log L) decoding delay.
  • in one experiment, a surface code with L = 5 is used (49 data and auxiliary bits in total).
  • the generation of quantum noise is simulated on the computer side, and the noisy symptom measurement circuit is run.
  • the obtained symptoms (120 classical bits of information) are transmitted to the FPGA.
  • the FPGA completes the decoding.
  • the error information obtained from the decoding is transmitted back to the computer side to determine whether the decoding succeeded.
  • the decoding performance of the FPGA is shown in Figure 22, and the overall decoding delay is 700 ns.
  • decoding performance can be greatly improved by using the second type of decoder; see the experimental results corresponding to the second error decomposition method in Figure 23.
  • the decoder provided in this application can significantly improve the actual decoding capability with lower computational complexity, network complexity and computational depth.
  • FIG. 24 shows a block diagram of a quantum error correction decoding device based on a neural network provided by an embodiment of the present application.
  • the device has the function of implementing the above method example, and the function can be implemented by hardware or by hardware executing corresponding software.
  • the device can be a computer device or can be set in a computer device.
  • the device 2400 may include: a symptom acquisition module 2410, a feature extraction module 2420, a feature decoding module 2430 and a result determination module 2440.
  • the symptom acquisition module 2410 is used to obtain error symptom information obtained by performing symptom measurement on the quantum circuit;
  • a feature extraction module 2420 configured to extract feature information from the error symptom information using a neural network decoder; wherein the neural network decoder includes the feature extraction network and n feature decoding networks, where n is an integer greater than 1;
  • a feature decoding module 2430 is used to decode the feature information through the neural network decoder to obtain a decoding result
  • the result determination module 2440 is used to determine the error result information of the quantum circuit according to the decoding result.
  • the feature extraction module 2420 is used to extract features from the error symptom information through a feature extraction network of a neural network decoder to obtain feature information; wherein the neural network decoder includes the feature extraction network and n feature decoding networks, where n is an integer greater than 1.
  • the feature decoding module 2430 is used to decode the feature information respectively through the n feature decoding networks to obtain decoding results corresponding to the n feature decoding networks respectively; wherein the networks of the n feature decoding networks are trained using a multi-task learning method to have a network capable of generating different decoding results.
  • the result determination module 2440 is used to determine error result information according to the decoding results corresponding to the n feature decoding networks.
  • the quantum bits contained in the quantum circuit are divided into n blocks, each block containing at least one quantum bit.
  • for the k-th feature decoding network among the n feature decoding networks, the decoding result corresponding to the k-th feature decoding network includes: the Pauli operator acting on the quantum bits contained in the k-th block among the n blocks, where k is a positive integer less than or equal to n.
  • the result determination module 2440 is used to determine the error result information according to the Pauli operators acting on the n blocks respectively.
  • the error result information indicates the quantum bits in the quantum circuit that have Pauli X errors and the quantum bits that have Pauli Z errors.
  • the division of the blocks is related to the correlation between errors, and the quantum bits contained in the same block are prone to correlated errors.
  • multiple different block partitioning methods are used to partition the quantum bits included in the sample quantum circuit, and the neural network decoder is jointly trained based on the multiple different block partitioning methods.
  • one of the multiple different block partitioning methods is used to partition the quantum bits included in the quantum circuit.
  • the feature decoding module 2430 is used to:
  • the feature information is decoded by n_1 feature decoding networks respectively, and decoding results corresponding to the n_1 feature decoding networks are obtained; wherein, for the i-th feature decoding network among the n_1 feature decoding networks, the decoding result corresponding to the i-th feature decoding network includes: the i-th canonical symptom related to the target error type, i is a positive integer less than or equal to n_1, and a canonical symptom refers to a canonical decomposition result of the error symptom information;
  • the feature information is decoded by n_2 feature decoding networks respectively to obtain decoding results corresponding to the n_2 feature decoding networks respectively; wherein, for the j-th feature decoding network among the n_2 feature decoding networks, the decoding result corresponding to the j-th feature decoding network includes: a fixed representative element related to the target error type, where j is a positive integer less than or equal to n_2;
  • n_1 and n_2 are both positive integers.
  • the target error types include Pauli X errors and Pauli Z errors, n_1 is equal to the sum of m_1 and m_2, m_1 and m_2 are both positive integers, and n_2 is equal to 2;
  • the m_1 feature decoding networks among the n_1 feature decoding networks are used to respectively decode the feature information to obtain m_1 canonical symptoms related to the Pauli X error;
  • the m_2 feature decoding networks among the n_1 feature decoding networks are used to respectively decode the feature information to obtain m_2 canonical symptoms related to the Pauli Z error;
  • one feature decoding network among the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli X error;
  • the other feature decoding network among the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli Z error;
  • the result determination module 2440 is used to:
  • X-type error result information is determined according to the fixed representative element associated with the Pauli X error and the m_1 canonical symptoms associated with the Pauli X error, wherein the X-type error result information indicates the quantum bits in the quantum circuit where Pauli X errors occur;
  • Z-type error result information is determined according to the fixed representative element associated with the Pauli Z error and the m_2 canonical symptoms associated with the Pauli Z error, wherein the Z-type error result information indicates the quantum bits in the quantum circuit where Pauli Z errors occur.
  • the feature extraction network includes a plurality of cascaded feature extraction subnetworks; wherein the input data of the first feature extraction subnetwork includes the error symptom information, the input data of the s-th feature extraction subnetwork includes the output data of the (s-1)-th feature extraction subnetwork, the output data of the last feature extraction subnetwork includes the feature information, and s is an integer greater than 1;
  • input data of the target feature extraction subnetwork is divided into multiple input data blocks of the same scale
  • the target feature extraction subnetwork is used to perform multiple local feature extraction mappings on the multiple input data blocks to obtain multiple groups of mapping output data; wherein each local feature extraction mapping is used to perform mapping processing on the regions at the same position in the multiple input data blocks to obtain a group of mapping output data; different local feature extraction mappings are used to perform mapping processing on the regions at different positions in the multiple input data blocks to obtain the multiple groups of mapping output data;
  • the target feature extraction subnetwork is also used to obtain output data of the target feature extraction subnetwork according to the multiple groups of mapping output data.
  • the target feature extraction subnetwork includes: at least one convolutional layer and at least one fully connected layer;
  • the at least one convolutional layer is used to perform the multiple local feature extraction mappings on the multiple input data blocks to obtain the multiple groups of mapping output data;
  • the at least one fully connected layer is used to obtain output data of the target feature extraction subnetwork according to the multiple groups of mapping output data.
  • a measurement and collection process of new error symptom information is performed in parallel.
  • the training process of the neural network decoder is as follows:
  • the parameters of the neural network decoder to be trained are adjusted according to the total loss function value to obtain the trained neural network decoder.
  • the feature extraction network and the n feature decoding networks included in the neural network decoder are deployed on the same chip.
  • the chip deploying the neural network decoder includes multiple processors in a tree structure, and any two processors that are not connected have parallelism.
  • when the device provided in the above embodiments implements its functions, the division into the above functional modules is merely used as an example for illustration.
  • the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the device and method embodiments provided in the above embodiment belong to the same concept, and their specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • FIG 25 shows a schematic diagram of the structure of a computer device provided in one embodiment of the present application.
  • the computer device may be the control device 43 in the application scenario of the solution shown in Figure 4.
  • the computer device may be used to implement the quantum error correction decoding method based on a neural network provided in the above embodiment. Specifically:
  • the computer device 2500 includes a processing unit 2501 (such as a CPU and/or a GPU), a system memory 2504 including a random access memory (RAM) 2502 and a read-only memory (ROM) 2503, and a system bus 2505 connecting the system memory 2504 and the processing unit 2501.
  • the computer device 2500 also includes a basic input/output system (I/O (Input/Output) system) 2506 for facilitating information transmission between various components in the computer, and a large-capacity storage device 2507 for storing an operating system 2513, application programs 2514, and other program modules 2515.
  • the basic input/output system 2506 includes a display 2508 for displaying information and an input device 2509 such as a mouse and a keyboard for user inputting information.
  • the display 2508 and the input device 2509 are both connected to the processing unit 2501 through an input/output controller 2510 connected to the system bus 2505.
  • the basic input/output system 2506 may also include the input/output controller 2510 for receiving and processing inputs from a plurality of other devices such as a keyboard, a mouse, or an electronic stylus.
  • the mass storage device 2507 is connected to the processing unit 2501 through a mass storage controller (not shown) connected to the system bus 2505.
  • the mass storage device 2507 and its associated computer readable medium provide non-volatile storage for the computer device 2500. That is, the mass storage device 2507 may include a computer readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
  • the computer-readable medium may include computer storage media and communication media.
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other solid-state memory technology, CD-ROM, DVD (Digital Video Disc) or other optical storage, cassettes, tapes, disk storage or other magnetic storage devices.
  • the computer device 2500 can also be connected to a remote computer on the network through a network such as the Internet. That is, the computer device 2500 can be connected to the network 2512 through the network interface unit 2511 connected to the system bus 2505, or the network interface unit 2511 can be used to connect to other types of networks or remote computer systems (not shown).
  • the memory stores a computer program, which is configured to be executed by one or more processors to implement the neural network-based quantum error correction decoding method provided in the above embodiment.
  • a computer-readable storage medium in which a computer program is stored, and when the computer program is executed by a processor of a computer device, the neural network-based quantum error correction decoding method provided in the above embodiment is implemented.
  • the computer-readable storage medium can be a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • a computer program product is also provided; when executed, the computer program product implements the neural-network-based quantum error correction decoding method provided in the above embodiments.
  • a chip which includes a programmable logic circuit and/or program instructions.
  • the chip runs on a computer device to implement the neural network-based quantum error correction decoding method provided in the above embodiment.
  • the chip is an FPGA chip or an ASIC chip.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Detection And Correction Of Errors (AREA)

Abstract

A neural-network-based quantum error correction decoding method, apparatus, device and chip, relating to the fields of artificial intelligence and quantum technology. The method includes: acquiring error symptom information obtained by performing symptom measurement on a quantum circuit (1010); extracting feature information from the error symptom information using a neural network decoder (1020); decoding the feature information through the neural network decoder to obtain a decoding result (1030); and determining error result information according to the decoding result (1040). The above solution fully improves decoding performance, shortens decoding time, and is also relatively easy to deploy in hardware in engineering practice.

Description

Neural-network-based quantum error correction decoding method, apparatus, device and chip

This application claims priority to Chinese patent application No. 202211468927.9, entitled "Neural-network-based quantum error correction decoding method, apparatus, device and chip", filed on November 22, 2022, the entire content of which is incorporated herein by reference.

Technical Field

Embodiments of this application relate to the fields of artificial intelligence and quantum technology, and in particular to a neural-network-based quantum error correction decoding method, apparatus, device and chip.

Background

All operations in real quantum computing, including quantum gates and quantum measurements, are noisy. That is, even the circuits used to perform quantum error correction are themselves noisy.

For fault-tolerant quantum error correction, symptom measurement is performed on a quantum circuit to obtain the corresponding error symptom information, and the error symptom information is then decoded to determine which quantum bits in the quantum circuit have suffered errors and the corresponding error types. The related art provides several schemes for decoding error symptom information, such as decoding schemes based on MWPM (Minimum Weight Perfect Matching), on the RG (Renormalization Group) algorithm, on CA (Cellular Automaton), on neural networks, and so on. At present, neural-network-based decoding schemes still have some shortcomings in decoding capability.
Summary

Embodiments of this application provide a neural-network-based quantum error correction decoding method, apparatus, device and chip. The technical solutions are as follows:

According to one aspect of the embodiments of this application, a neural-network-based quantum error correction decoding method is provided. The method is executed by a control device and includes: acquiring error symptom information obtained by performing symptom measurement on a quantum circuit; extracting feature information from the error symptom information through a neural network decoder; decoding the feature information through the neural network decoder to obtain a decoding result; and determining error result information of the quantum circuit according to the decoding result.

According to one aspect of the embodiments of this application, a neural-network-based quantum error correction decoding apparatus is provided, the apparatus including:

a symptom acquisition module, configured to acquire error symptom information obtained by performing symptom measurement on a quantum circuit;

a feature extraction module, configured to extract feature information from the error symptom information through a neural network decoder;

a feature decoding module, configured to decode the feature information through the neural network decoder to obtain a decoding result;

a result determination module, configured to determine error result information of the quantum circuit according to the decoding result.

According to one aspect of the embodiments of this application, a computer device is provided, including a processor and a memory, the memory storing a computer program that is loaded and executed by the processor to implement the above method.

According to one aspect of the embodiments of this application, a computer-readable storage medium is provided, the storage medium storing a computer program that is loaded and executed by a processor to implement the above method.

According to one aspect of the embodiments of this application, a computer program product is provided, the computer program product including a computer program that is loaded and executed by a processor to implement the above method.

According to one aspect of the embodiments of this application, a chip is provided, the chip being deployed with a neural network decoder, the neural network decoder being configured to implement the above method.

Embodiments of this application provide an error correction decoding solution based on a multi-task learning neural network model: a neural network decoder extracts corresponding feature information from the input error symptom information, then decodes the feature information and outputs decoding results describing the local information distributions of the noise decomposition, and the error result information is determined from the decoding results. Compared with solutions that use multiple neural network decoders, the solution of this application needs only a single neural network decoder to determine the error result information accurately, thereby fully improving decoding performance and shortening decoding time without increasing algorithm complexity and while maintaining scalability; it is also relatively easy to deploy in hardware in engineering practice.
Brief Description of the Drawings

Fig. 1 is a schematic diagram of the rotated surface code according to an embodiment of this application;

Fig. 2 is a schematic diagram of surface code error occurrence according to an embodiment of this application;

Fig. 3 is a schematic comparison of the decoding performance and decoding time of several decoding schemes according to an embodiment of this application;

Fig. 4 is a schematic diagram of an application scenario of a solution provided by an embodiment of this application;

Fig. 5 is a schematic diagram of the error correction decoding process involved in the application scenario of Fig. 4;

Fig. 6 is a schematic diagram of the simple representations corresponding to symptom points in the rotated surface code according to an embodiment of this application;

Fig. 7 is a schematic diagram of a symptom measurement circuit according to an embodiment of this application;

Fig. 8 is a schematic diagram of a three-dimensional symptom distribution according to an embodiment of this application;

Fig. 9 is an architecture diagram of decoding with multiple neural network models according to an embodiment of this application;

Fig. 10 is a flowchart of a neural-network-based quantum error correction decoding method according to an embodiment of this application;

Fig. 11 is an architecture diagram of a neural network decoder based on multi-task learning according to an embodiment of this application;

Fig. 12 is a schematic diagram of the decoding process of the first class of decoder according to an embodiment of this application;

Fig. 13 is a schematic diagram of the decoding process of the second class of decoder according to an embodiment of this application;

Fig. 14 is a schematic diagram of the decoding process of the second class of decoder according to another embodiment of this application;

Fig. 15 is a schematic diagram of a feature extraction subnetwork performing local feature extraction mapping according to an embodiment of this application;

Fig. 16 is a schematic diagram of the correlated noise generated by the symptom measurement circuit according to an embodiment of this application;

Fig. 17 is a schematic diagram of a physical qubit partition according to an embodiment of this application;

Fig. 18 is a schematic diagram of a physical qubit partition according to another embodiment of this application;

Fig. 19 is a schematic diagram of a multi-core architecture according to an embodiment of this application;

Fig. 20 is a schematic diagram of experimental result data corresponding to the first error decomposition method according to an embodiment of this application;

Fig. 21 is a schematic diagram of chip deployment according to an embodiment of this application;

Fig. 22 is a schematic diagram of experimental result data corresponding to the first error decomposition method according to another embodiment of this application;

Fig. 23 is a schematic diagram of experimental result data corresponding to the second error decomposition method according to an embodiment of this application;

Fig. 24 is a block diagram of a neural-network-based quantum error correction decoding apparatus according to an embodiment of this application;

Fig. 25 is a schematic structural diagram of a computer device according to an embodiment of this application.
Detailed Description

Before describing the embodiments of this application, some terms used in this application are first explained.
1. Quantum Computation (QC): a way of rapidly completing specific computational tasks by exploiting the superposition and entanglement properties of quantum states.

2. Quantum Error Correction (QEC): an encoding that maps a quantum state into a subspace of the Hilbert space of a many-body quantum system. Quantum noise moves the encoded state into other subspaces. By continuously observing the space the quantum state occupies (symptom extraction), quantum noise can be evaluated and corrected without disturbing the encoded quantum state, thereby protecting it from the noise. Specifically, an [[n, k, d]] quantum error correction code encodes k logical qubits in n physical qubits and can correct any ⌊(d-1)/2⌋ errors occurring on arbitrary single qubits.

3. Data quantum state: the quantum state of the data qubits used to store quantum information during quantum computation.

4. Stabilizer generator: also called parity-check operator. The occurrence of quantum noise (errors) changes the eigenvalues of certain stabilizer generators, and quantum error correction can be performed based on this information.

5. Stabilizer group: the group generated by the stabilizer generators; the Abelian group generated by the stabilizer generators is called the stabilizer group. If there are k stabilizer generators, the stabilizer group contains 2^k elements and is a commutative (Abelian) group.

6. Error syndrome: when there is no error, the eigenvalues of the stabilizer generators are 0; when quantum noise occurs, the eigenvalues of those stabilizer generators (parity-check operators) of the code that anticommute with the error become 1. The bit string formed by these 0/1 symptom bits is called the error syndrome.

7. Topological quantum code: a special class of quantum error correction codes whose qubits are distributed on lattice arrays of two or more dimensions. The lattice forms the discrete structure of a higher-dimensional manifold. The stabilizer generators of such codes are defined on geometrically neighboring, finite sets of qubits, so they are geometrically local and physically easy to measure. The qubits acted on by the logical operators of such codes form a class of topologically nontrivial geometric objects on the manifold of the lattice.

8. Surface code: a class of topological quantum error correction codes defined on a two-dimensional manifold. Its stabilizer generators are usually supported on 4 qubits (2 qubits at the boundary), and the logical operators are nontrivial strip-like chains crossing the array. A concrete two-dimensional structure of the surface code (7x7, including 49 data qubits and 48 auxiliary qubits, 97 physical qubits in total, capable of correcting arbitrary errors occurring on two qubits) is shown in Fig. 1: the black dots 11 represent the data qubits used for quantum computation, and the crosses 12 represent auxiliary qubits. The auxiliary qubits are initially prepared in the |0> or |+> state. The hatched and white squares represent two different types of stabilizer generators, used to detect Z errors and X errors respectively.

9. Surface code scale L: one quarter of the perimeter of the surface code array. The surface code array in Fig. 1 has L = 7, including 49 data qubits and 48 auxiliary qubits, 97 physical qubits in total.

10. X and Z errors: randomly generated Pauli X and Pauli Z evolution errors on the quantum states of the physical qubits. According to quantum error correction theory, if a code can correct X and Z errors, it can correct arbitrary errors occurring on single qubits.

11. Fault-tolerant Quantum Error Correction (FTQEC): all operations in real quantum computing, including quantum gates and quantum measurements, carry noise; that is, even the circuits used for quantum error correction are themselves noisy. Fault-tolerant quantum error correction means that, through clever design of the error correction circuit, a noisy correction circuit can still be used to correct errors while preventing errors from spreading over time.

12. Fault Tolerant Quantum Computation (FTQC): in quantum computation, every physical operation carries noise, including the quantum error correction circuit itself and qubit measurement. Assuming classical operations (e.g., instruction input, code decoding) are noiseless, fault-tolerant quantum computation is a technical scheme that, by properly designing the quantum error correction scheme and performing quantum gate operations on the encoded logical quantum states in specific ways, ensures that errors can be effectively controlled and corrected while performing quantum computation with noisy qubits.

13. Physical qubit: a qubit implemented with real physical devices.

14. Logical qubit: a mathematical degree of freedom in the Hilbert subspace defined by an error correction code. Its quantum state is usually described as a many-body entangled state, generally a two-dimensional subspace of the joint Hilbert space of multiple physical qubits. Fault-tolerant quantum computation needs to run on logical qubits protected by error correction codes.

15. Quantum gate/circuit: a quantum gate/circuit acting on physical qubits.

16. Threshold theorem: for quantum computing schemes that meet the requirements of fault-tolerant quantum computation, when the error rate of every operation is below a certain threshold, the correctness of the computation can be made arbitrarily close to 1 by using better error correction codes, more qubits and more quantum operations, while this extra resource overhead is negligible relative to the exponential speedup of quantum computing.

17. Neural network: an artificial neural network is an adaptive nonlinear dynamic system formed by interconnecting a large number of simple basic elements, the neurons. Each neuron is structurally and functionally simple, but the system behavior produced by combining many neurons is very complex and can in principle express arbitrary functions.

18. Convolutional Neural Network (CNN): a class of feed-forward neural networks involving convolutional computation and having a deep structure. The convolutional layer is the core cornerstone of a CNN: discrete two- or three-dimensional filters (also called convolution kernels, i.e., two- or three-dimensional matrices) are convolved with two- or three-dimensional data lattices.

19. Rectified Linear Units layer (ReLU layer): uses the rectified linear unit f(x) = max(0, x) as the activation function of the neural network.

20. Leaky Rectified Linear Unit layer (LeakyReLU): an activation function based on ReLU, but with a small slope for negative values rather than a flat one.

21. Back Propagation (BP): a supervised learning algorithm in artificial neural networks. The BP neural network algorithm can theoretically approximate arbitrary functions; its basic structure consists of nonlinear transformation units and has strong nonlinear mapping capability.

22. Field Programmable Gate Array (FPGA): a further development based on programmable devices such as PAL (Programmable Array Logic) and GAL (Generic Array Logic). It appeared as a semi-custom circuit in the Application Specific Integrated Circuit (ASIC) field, solving the deficiencies of custom circuits while overcoming the limited gate count of earlier programmable devices.

23. Application Specific Integrated Circuit (ASIC): an integrated circuit designed and manufactured to meet the requirements of specific users and specific electronic systems. Using CPLDs (Complex Programmable Logic Devices) and FPGAs for ASIC design is one of the most popular approaches; both are field-programmable by the user and support boundary-scan technology, but they have their own characteristics in integration density, speed and programming method.

24. Single Flux Quantum (SFQ) circuit: also called RSFQ (Rapid Single Flux Quantum) circuit, a circuit composed of Josephson junctions (JJ), representing "1" and "0" by the presence or absence of a flux quantum. In circuit diagrams an "X" denotes a Josephson junction: two superconducting layers separated by a very thin insulating layer. It can be used for digital logic computation.

25. Multi-task learning: in this application, defined as using one and the same neural network model to complete multiple classification tasks simultaneously. Multiple neural networks are combined into one large neural network, sharing as much of the network as possible, so as to reduce the overall computational and space complexity.

26. Canonical Pauli Operator: for a particular stabilizer code, an equivalence class of Pauli operators can be represented by one designated representative Pauli operator; this representative is called the canonical Pauli operator of that operator class.

27. Adam (Adaptive moment estimation): an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. Adam is easy to implement, computationally efficient and has low memory requirements.
The solutions provided by the embodiments of this application involve the application of machine learning (an artificial intelligence technology) in the field of quantum technology, specifically in decoding algorithms for quantum error correction codes, as explained through the following embodiments.

Since qubits are extremely susceptible to noise, implementing quantum computation directly on physical qubits is not yet realistic with current technology. The development of quantum error correction codes and fault-tolerant quantum computing technology in principle makes arbitrary-precision quantum computation on noisy qubits possible. In general, measuring the stabilizer generators of a quantum error correction code (also called parity checks on the qubits) requires introducing long-range quantum gates while also preparing complex ancillary quantum states with extra qubits to achieve fault tolerance. Due to the limitations of current experimental techniques, neither high-precision long-range quantum gates nor complex ancillary state preparation is yet available. Fault-tolerant error correction and computation schemes using the surface code require neither long-range gates nor complex ancillary states, and are therefore considered very likely to realize universal fault-tolerant quantum computers with current technology.

As an error correction code: after an error occurs, the error syndrome can be obtained through parity checks; then, based on these symptoms, a code-specific decoding algorithm is needed to determine the location and type of the error (an X error, a Z error, or both, i.e., a Y error). For the surface code, errors and error symptoms have concrete spatial positions: when an error triggers a symptom, the eigenvalue of the auxiliary qubit at the corresponding position is 1 (which can be viewed as a point particle appearing at that position); when there is no error, the eigenvalue is 0. Decoding then reduces to the following problem: given a spatial digital array (2- or 3-dimensional, with values 0 or 1), infer, according to a specific error model (the probability distribution of errors occurring on the qubits), which qubits most probably suffered errors and the specific error types, and perform error correction according to this inference.

As shown in Fig. 2, which illustrates the occurrence of surface code errors: the qubits sit on the edges of the two-dimensional array, and the auxiliary qubits measuring the error symptoms sit on the nodes (these symptoms are perfect measurements). The black edges 21 represent the error chain formed by the erroneous qubits, and the hatched circles 22 represent the points with symptom value 1 triggered by the errors. Decoding is accomplished as long as the chain-like errors can be determined from the point-like symptoms.
As introduced above, decoding the error symptom information with the code's decoding algorithm (which may also be called a decoder) yields the corresponding error result information, such as the location and type of the errors. A decoder's decoding capability can be measured by several key indicators: decoding algorithm complexity, decoding time, decoding performance, suitability for real-time error correction, and ease of engineering implementation.

Decoding algorithm complexity: the total number of basic computational steps required to run the decoding algorithm, corresponding to computational complexity; the higher the complexity, the larger the required amount of computation.

Decoding time: time here is an abstract notion, distinct from but strongly related to real decoding time. It refers to the algorithm depth after the decoding algorithm has been fully parallelized. This depth determines the lower bound on the real running time, i.e., the algorithm runtime needed after maximal parallelization.

Decoding performance: measured by the error rate occurring on logical qubits after decoding and correction under a particular noise model. For the same physical qubit error rate, the lower the logical error rate, the better the decoding performance.

Suitability for real-time error correction: qubit lifetimes are short (for example, superconducting qubit lifetimes are around 150 microseconds with good fabrication processes; after multiple rounds of symptom measurement, real-time decoding and correction must be performed based on these symptoms, and during decoding the system is in an idle state in which errors gradually accumulate over time; theory requires the whole error correction process to take less than 1/1000 to 1/100 of the superconducting qubit lifetime, i.e., a rigid margin of roughly 150 ns to 1500 ns for the whole correction, otherwise the error rate may exceed the surface code's correction capability). CPUs (Central Processing Units) and GPUs (Graphics Processing Units) suffer from uncertainty in memory read/write times, cache-hit uncertainty, and problems such as branch jumps, which cause long delays and cannot meet the requirement; moreover, CPU/GPU computational microarchitectures are designed for generality rather than optimized for decoding algorithms, so they cannot reach the performance target. This application therefore considers porting the decoding algorithm to dedicated computing devices such as FPGAs or ASICs; such devices are better suited to running simple, parallelizable steps (e.g., vector inner products, matrix multiplication) and ill-suited to complex instructions with conditional branching and jumps.

Ease of engineering implementation: whether it is easy to deploy the decoder in hardware. Even if a decoding algorithm's theoretical time complexity is low, in practice either the control is complex or the actual amount of computation is still large, requiring multiple computing devices to cooperate in parallel computation; the delay caused by inter-chip communication may even exceed that of the computation itself, which is unacceptable for real-time decoding. An algorithm that is easy to implement in engineering must therefore genuinely reduce the amount of computation, so as to reduce the number of computing devices used and the communication between computing devices (chips). In addition, since most of the chip area of an FPGA/ASIC is used for real-time computation, the on-chip cache that can be reserved is limited, so not too much data may be preloaded on chip. For a neural network decoder specifically, this requires that the number of usable network parameters not be too large, not grow too fast with the code scale, and fit into on-chip memory for convenient access.
Some currently known quantum error correction decoding schemes include those based on MWPM, on the RG algorithm, on CA, on MCMC (Monte Carlo Markov Chain), on MLD (Maximum Likelihood Decoding), on NN (Neural Network), and so on. Fig. 3 gives a rough comparison of the decoding performance and decoding time of these schemes, with one labeled dot (MWPM, RG, CA, MCMC, MLD, NNbD) per scheme. Overall, Fig. 3 also shows that neural-network-based decoding schemes can achieve very good decoding performance for small-scale surface codes while using a relatively short decoding time.

This application proposes an end-to-end machine-learning decoding method based on multi-task learning. Without increasing algorithm complexity (computational complexity O(L^3), decoding time O(log L)) and while maintaining scalability, the method greatly improves decoding performance, placing it in the region 30 of the dashed circle at the lower right of Fig. 3 and achieving simultaneously optimal decoding time and decoding performance. At the same time, its structure is simpler — from O(L^2) models down to O(1), with no communication needed between models — making hardware deployment easier in engineering.
Refer to Fig. 4, which shows a schematic diagram of an application scenario of a solution provided by one embodiment of this application. As shown in Fig. 4, the application scenario may be a superconducting quantum computing platform and includes: a quantum circuit 41, a dilution refrigerator 42, a control device 43 and a computer 44.

The quantum circuit 41 is a circuit acting on physical qubits; it can be implemented as a quantum chip, such as a superconducting quantum chip near absolute zero. The dilution refrigerator 42 provides the near-absolute-zero environment for the superconducting quantum chip.

The control device 43 controls the quantum circuit 41, and the computer 44 controls the control device 43. For example, a written quantum program is compiled by software on the computer 44 into instructions sent to the control device 43 (e.g., an electronic/microwave control system), and the control device 43 converts these instructions into electronic/microwave control signals input to the dilution refrigerator 42 to control the superconducting qubits at 10 mK. The readout process is the reverse.

As shown in Fig. 5, the neural-network-based quantum error correction decoding method provided by the embodiments of this application needs to be combined with the control device 43 (e.g., by integrating the decoding algorithm into the electronic/microwave control system). After the general control system 43a of the control device 43 (e.g., a central-board FPGA) reads the error symptom information from the quantum circuit 41, it issues an error correction instruction, containing the error symptom information of the quantum circuit 41, to the error correction module 43b of the control device 43. The error correction module 43b may be an FPGA or ASIC chip; it runs the neural-network-based quantum error correction decoding algorithm, decodes the error symptom information, and in real time converts the decoded error result information into error correction control signals sent to the quantum circuit 41 for error correction.
To facilitate the following description, several basic concepts proposed in this application are first introduced.

1. Canonical representation of Pauli operators

Given an arbitrary Pauli operator P and the generators {g_i} of the stabilizer group of an error correction code (this application takes the rotated surface code as an example; any topological code can be defined analogously), a Pauli operator P acting on the physical qubits supporting the code can be decomposed as:

P = L(P)·T(S(P))

where S(P) is the part of the generators {g_i} that anticommutes with P (also called the symptom of the Pauli operator P); S(P) can be viewed as a bit array of 0s and 1s. T(S(P)) is a Pauli operator produced from this part of the generators by a mapping. It should be noted that, in quantum mechanics, operators F and G are said to commute if FG = GF, and to anticommute if FG = -GF. The correspondence between T(S(P)) and S(P) is one-to-one and is called the simple representation of P. For the rotated surface code, the simple representation can be given a geometrically meaningful definition: the shortest Pauli operator, not commuting with P, that connects the symptom to the boundary. As shown in Fig. 6, which shows the simple representations corresponding to symptom points in the rotated surface code: point a is a single symptom point with value 1, and line 61 represents an X-type Pauli operator, namely the shortest Pauli operator not commuting with P that connects symptom point a to the boundary. Similarly, point b is a single symptom point with value 1, and line 62 represents a Z-type Pauli operator, the shortest Pauli operator not commuting with P that connects symptom point b to the boundary. Topological codes in general admit similar simple-representation mappings.

L(P) is a specific operator in the logical class of the code to which P belongs (once selected, it is fixed). Considering another Pauli operator P', there is a similar decomposition:

P' = L(P')·T(S(P'))

If S(P') = S(P) and L(P') and L(P) belong to the same logical class, then P' and P differ only by an element of the stabilizer group; that is, they are equivalent for the purposes of error correction. Then, for any Pauli operator P, one can define:

P_c = L_c(P)·T(S(P))

where L_c(P) is a fixed representative element of the logical class of L(P); P_c is then called the canonical representation of P, and L_c(P)·T(S(P)) the canonical decomposition of P. All Pauli operators are converted to their equivalent canonical representations. This drastically restricts the unnecessary diversity of Pauli operators; especially when Pauli operators are chosen as the model's output, it greatly reduces the difficulty of model training and thus improves the convergence speed of the training process.
2. Fault-tolerant error correction with multi-model learning

Theory shows that fault-tolerant correction of the surface code can be performed by collecting all symptom information gathered over O(L) rounds of symptom measurement and decoding it jointly, ensuring that the code's error correction capability is unaffected in the fault-tolerant setting.

Exemplarily, the symptom measurement circuits may be as shown in Fig. 7, where part (a) shows the eigenvalue measurement circuit of the stabilizer generators detecting Z errors, and part (b) shows that for X errors. In these circuits, the ordering of the CNOT (controlled-NOT) gates is critical and must not be reversed; otherwise different quantum gates would conflict by using the same qubit. In this process, all steps, including the CNOT gates, ancillary state preparation and the final ancillary state measurement, introduce noise. Because CNOT gates propagate errors, and the X-type and Z-type symptom measurements are mutually interleaved, the arrangement shown in Fig. 7 minimizes error propagation so that its effect on the error correction capability is negligible; other orderings would greatly reduce the correction capability.

Thus, after multiple rounds of symptom measurement, the error symptom information obtained through the symptom measurement circuit becomes a three-dimensional 0-1 array, as shown in Fig. 8, which shows a three-dimensional symptom distribution with time along the vertical axis. It can be viewed as a three-dimensional data array of 0s and 1s. Fig. 8 contains 4 slices 81 in total, each representing the error symptom information obtained in one measurement. Lines 82 represent symptoms triggered by Z errors, lines 83 represent symptoms triggered by X errors, and lines 84 represent measurement errors.

After O(L) such rounds of symptom measurement, the fault-tolerant optimal decoder (Maximum a posteriori, MAP) is mathematically defined as:

Ê = argmax_E Pr(E | S)    (1)

where S is the three-dimensional 0-1 data array representing the error symptom information, and Ê is the most probable error on the two-dimensional data qubits inferred from the measured error symptom information. Operations corresponding to Ê are applied to the physical qubits to correct the physical errors occurring on them. Note that Ê need not be identical to the error E_p actually remaining on the physical qubits — as long as the weight of their difference is small enough that it can be corrected in the next round of error correction. It should be noted that each data qubit's classification result has 4 possible cases: I, X, Y, Z, where I means no error, X an X error, Z a Z error, and Y both an X and a Z error. Moreover, a quantum circuit with code scale L contains L^2 data qubits, so there are 4^(L^2) different Pauli operators as possible choices of E. Since it is impossible to enumerate the probabilities of all E, the decoding problem is #P-complete in computational complexity. For effective decoding this must be simplified. One can first consider using the canonical representation of E:

E_c = L_c(E)·T(S(E))

to represent all equivalent E. The number of Pauli operators to enumerate is thereby reduced to 4^(L^2)/2^(L^2-1) = 2^(L^2+1). The reason is that, for an error correction code, an error multiplied by any element of the stabilizer group is equivalent to it, so when classifying errors these equivalent errors can be merged; since the stabilizer group contains 2^(L^2-1) elements, only 2^(L^2+1) types of errors need to be enumerated here. In what follows, whenever a Pauli error E is involved, the canonical representation E_c representing its equivalence class of Pauli operators is used, without further distinction. A worked check of this counting follows.
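As a concreteness check on the counting just given (assuming, as above, a rotated surface code with L^2 data qubits and L^2 - 1 stabilizer generators), the reduction can be written out in LaTeX:

    \[
    \frac{4^{L^2}}{2^{L^2-1}} \;=\; \frac{2^{2L^2}}{2^{L^2-1}} \;=\; 2^{L^2+1},
    \qquad\text{e.g. } L=5:\ 2^{26}\approx 6.7\times 10^{7}
    \]
    % versus 4^{25} = 2^{50} \approx 1.1\times 10^{15} Pauli operators
    % before merging equivalent errors.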
Going further, the information characterizing the error needs to be represented and, in divide-and-conquer fashion, split into different information blocks to reduce the enumeration complexity. This application provides two error decomposition methods.

The first method applies the canonical decomposition to the canonical representation. Owing to the one-to-one correspondence between simple errors and symptoms, T(S(E)) can be described by elements of two symptom sets, the X-type and Z-type symptom bits ((L^2-1)/2 of each), and L_c can be determined with 2 bits. S_X and S_Z are called the canonical symptoms of the decoding output. Taking S_X as an example, it can be decomposed as:

S_X = S_X^(1) ∪ S_X^(2) ∪ … ∪ S_X^(m)

where each S_X^(i) contains a portion of the X-type symptom bits. The Z-type symptom bits admit the same splitting. The MAP decoding process for Z-type errors is thus approximated as:

Ŝ_X^(i) = argmax_{S_X^(i)} Pr(S_X^(i) | S) for each i, together with L̂_c^Z = argmax_{L_c^Z} Pr(L_c^Z | S).

X-type errors can be treated analogously, so the MAP decoding process for X-type errors is approximated as:

Ŝ_Z^(j) = argmax_{S_Z^(j)} Pr(S_Z^(j) | S) for each j, together with L̂_c^X = argmax_{L_c^X} Pr(L_c^X | S).

The second method decomposes directly according to the distribution of the Pauli error over the physical qubits. The physical qubits can be partitioned into a disjoint union of blocks:

Q = R_1 ∪ R_2 ∪ … ∪ R_m, with R_i ∩ R_j = ∅ for i ≠ j.

Denoting this partition by R = {R_1, …, R_m}, we then have:

E = E_1 ⊗ E_2 ⊗ … ⊗ E_m

where E_i is the Pauli operator of E acting on R_i and ⊗ denotes the direct product. In the decoding process this can then be approximately reduced to:

Ê_i = argmax_{E_i} Pr(E_i | S) for each i,

where

Pr(E_i | S) = Σ_{E_ī} Pr(E_i ⊗ E_ī | S)

is the marginal probability distribution of E_i, with E_ī denoting the part of E acting outside block i, and i ranging over the integers in [1, m]. As in the case of the canonical decomposition, the error can be further decomposed into X and Z parts E_i^X and E_i^Z, each decoded separately with the approximate MAP.

Whichever method is used, the chosen partition must guarantee |S_X^(i)| ~ O(1) or |R_i| ~ O(1), i.e., their sizes have an upper bound, and this bound is independent of the code scale. Decoding performance should be improved as much as possible under this constraint. The intuition behind this choice is that the correlations present — whether among the symptoms or among the errors occurring on the physical bits — are finite, and this correlation scale does not grow with L.

Note that MWPM uses only X-type (Z-type) symptoms to decode Z-type (X-type) errors, whereas here all symptom bit information can be required simultaneously for decoding both X and Z errors. This is because Z and X errors are mutually correlated, so their corresponding symptom bits are not completely independent. Considering all symptoms jointly will determine the positions of X and Z errors more precisely, a point not yet exploited by MWPM or other algorithms.
The purpose of the machine-learning-based approach is simply to approximate the distribution functions Pr(E_i | S) with neural networks. Since these functions share the same input S (the three-dimensional symptom bits), the direct approach is to approximate each distribution function with its own neural network model, normalize at the end of each network with a Softmax function to generate the corresponding probability distribution, aggregate the distribution results into error information, and apply corrections on the data qubits. The whole decoding process is shown in Fig. 9: m (m greater than 1) neural network models separately decode the error symptom information S to obtain m probability distributions, and the error result information — indicating the erroneous data qubits and the corresponding error types — is determined from these m distributions.

Note that each network's output size has a maximum upper bound; the number of neural networks is therefore proportional to the number of physical qubits, L^2.

When each symptom measurement is noiseless, only a single layer of symptom measurement is needed, and there is no need to further infer the simple-error part of the canonical representation of the error operator; Eq. (1) above therefore simplifies to:

Ê_c = L̂_c·T(S), with L̂_c = argmax_{L_c} Pr(L_c | S).

For a surface code encoding a single bit, this decoding method reduces to a 4-way classification problem, so a single network suffices to complete the decoding. From a topological viewpoint, in the fault-tolerant setting with measurement noise the decoding problem cannot be reduced to a similarly simple classification problem, so fault-tolerant decoding with neural networks is much more complex than the perfect-symptom case.
Refer to Fig. 10, which shows a flowchart of a neural-network-based quantum error correction decoding method provided by one embodiment of this application. The method can be applied to the control device in the application scenario shown in Fig. 4, and may include at least one of the following steps 1010 to 1040:

Step 1010: acquire error symptom information obtained by performing symptom measurement on a quantum circuit.

The error symptom information is a data array formed from the eigenvalues of the stabilizer generators of a quantum error correction code.

Performing error symptom measurement on the quantum circuit with a quantum error correction code yields the corresponding error symptom information, which is a data array formed from the eigenvalues of the code's stabilizer generators. Optionally, the error symptom information is a two- or three-dimensional data array composed of 0s and 1s. For example, when there is no error, a stabilizer generator's eigenvalue is 0; when an error occurs, the eigenvalue is 1.

Taking the surface code as an example: for the surface code, errors and error symptoms have concrete spatial positions. When an error triggers a symptom, the eigenvalue of the auxiliary qubit at the corresponding position is 1 (which can be viewed as a point particle appearing at that position); with no error, it is 0. Therefore, for the surface code, if errors in the correction process itself are ignored (i.e., the measurement process is perfect, called perfect symptoms), the error symptom information can be regarded as a two-dimensional data array of 0s and 1s.

Optionally, if multiple rounds of symptom measurement are performed on the quantum circuit, each round yields error symptom information in the form of a two-dimensional data array, and multiple rounds together yield error symptom information in the form of a three-dimensional data array, as shown in Fig. 8. A small sketch of assembling this 3D array follows.
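A minimal sketch of assembling the 3D symptom array: T rounds of symptom measurement, each a 2D 0/1 array of stabilizer eigenvalues, are stacked with time as the third axis. Taking the XOR of consecutive rounds to mark changes is a common convention and an assumption here, not taken from the text of this application.

    import numpy as np

    T = 5
    rounds = [np.random.randint(0, 2, size=(4, 4), dtype=np.uint8) for _ in range(T)]
    syndrome_3d = np.stack(rounds)                             # shape (T, 4, 4)
    flips = np.bitwise_xor(syndrome_3d[1:], syndrome_3d[:-1])  # round-to-round changes
    print(syndrome_3d.shape, flips.shape)                      # (5, 4, 4) (4, 4, 4)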
Step 1020: extract feature information from the error symptom information using a neural network decoder.

As an optional implementation, when using a neural network decoder to extract feature information from the error symptom information, the control device may perform feature extraction on the error symptom information through the feature extraction network of the neural network decoder to obtain the feature information, where the neural network decoder includes the feature extraction network and n feature decoding networks, n being an integer greater than 1.

The neural network decoder is a machine learning model, built on neural networks, for decoding error symptom information. Its input data is the error symptom information, and its output data is the error result information corresponding to that error symptom information.

In the embodiments of this application, the neural network decoder includes one feature extraction network and multiple feature decoding networks. The feature extraction network performs feature extraction on the error symptom information to obtain the feature information. The feature information output by the feature extraction network is then fed separately into the multiple feature decoding networks, which each decode it to obtain the decoding result corresponding to each feature decoding network.

In some embodiments, the feature extraction network may be built on a CNN. In some embodiments, the feature decoding networks may be built on FCNs (Fully Connected Neural Networks). Of course, this application does not exclude other network structures for the feature extraction network and the feature decoding networks.

In Fig. 9, the number of models grows linearly with the number of bits L^2, incurring large computational complexity. Too many models also greatly increase the difficulty of deploying the algorithm on concrete hardware. Although these models can run in parallel, the hardware resources consumed by the increased complexity grow too fast, demanding a large number of FPGA or ASIC chips and causing system-integration difficulties. We therefore try to extract the reusable parts of these O(L^2) models into a front end, namely the feature extraction network. The front end's output fans out to n ~ O(L^2) streamlined feature decoding networks, such as feed-forward fully connected networks (Feed-Forward Network, FFN), producing n probability distributions to complete the decoding; these n different feature decoding networks form the model's back end.

Exemplarily, the whole model is shown in Fig. 11. The front end is the feature extraction network, which may include multiple cascaded feature extraction subnetworks and one feature fusion subnetwork. The feature extraction subnetworks extract local feature information in a divide-and-conquer manner, and the feature fusion subnetwork finally aggregates and compresses all local feature information to obtain the feature information. The feature extraction subnetworks may be built on CNNs, e.g., each including one or more convolutional layers; the feature fusion subnetwork may be built on a fully connected network, e.g., including one or two fully connected layers.

This is a typical multi-task learning neural network model. Its effectiveness rests on the front end extracting enough features to allow any back-end feature decoding network to synthesize accurate local information distributions of the noise decomposition (e.g., Pr(E_i | S)). In principle this is reasonable, because the back end can be regarded as a localized, simplified version of a global classifier, and this holds provided the information supplied by the front end could in principle allow such a global classifier to compute the Pauli operator distributions. At the same time, it is required that the front end's computational complexity be acceptable in engineering terms without compromising decoding performance.

The sizes of the front-end and back-end networks depend on the specific situation. According to current experimental conclusions, the size of the back-end feature decoding networks (e.g., using fully connected layers) can be independent of the code scale L, the number of back-end model parameters is proportional to O(L^2), and the computation depth is O(1); the overall back-end computational complexity is O(L^2). The front-end computational complexity and the complexity analysis of the whole algorithm are described below.

In addition, both error decomposition methods introduced above can use this multi-task-learning model architecture. The advantage of the first decomposition is a smaller actual network, but, limited by the way its training data is produced, the performance of the resulting decoder is bounded. The second decomposition provides end-to-end training and uses X-type and Z-type symptoms simultaneously for decoding, providing a large decoding performance improvement. In the embodiments of this application, a neural network decoder trained and performing inference with the first decomposition is called a first-class decoder; one using the second decomposition is called a second-class decoder.

Optionally, when performing feature extraction on the error symptom information, the feature extraction network of the neural network decoder uses the divide-and-conquer idea and adopts block-wise feature extraction. That is, each (or some) feature extraction subnetwork performs block-wise feature extraction on its input data. Block-wise feature extraction means that, when extracting feature information, the feature extraction subnetwork partitions the input data into multiple small blocks and extracts features from each block separately; i.e., after partitioning the input data into at least two blocks, at least two feature extraction units perform feature extraction on these blocks in parallel, with the blocks and feature extraction units in one-to-one correspondence, each unit extracting features from one block, and the numbers of blocks and units equal. Moreover, feature extraction over these blocks proceeds in parallel, i.e., simultaneously, which helps reduce the time feature extraction consumes. Exemplarily, as shown in Fig. 11, the error symptom information is a three-dimensional symptom bit array; after it is partitioned into C_1 blocks, the first feature extraction subnetwork performs parallel feature extraction on the C_1 blocks to obtain C_2 blocks. Similarly, the second subnetwork performs parallel feature extraction on the C_2 blocks to obtain C_3 blocks; and so on, the k-th subnetwork performs parallel feature extraction on C_k blocks to obtain C_{k+1} blocks, k being a positive integer. Finally, the feature fusion subnetwork fuses and compresses the C_{k+1} blocks to obtain the feature information, which serves as the back end's input. A compact sketch of this overall front-end/back-end architecture follows.
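A minimal sketch (Python, PyTorch assumed) of the overall multi-task decoder of Fig. 11: a shared front end stacks feature extraction subnetworks and a fusion layer, and its output fans out to n lightweight feature decoding heads (FFNs) that produce the n local distributions Pr(E_i | S). All layer sizes are illustrative assumptions; the 4^4-class heads correspond to toy blocks of 4 qubits (this application's example partitions use blocks of up to 12 qubits, i.e., far larger output spaces).

    import torch
    import torch.nn as nn

    class MultiTaskDecoder(nn.Module):
        def __init__(self, n_heads: int = 7, n_classes: int = 4 ** 4):
            super().__init__()
            self.front_end = nn.Sequential(   # cascaded subnetworks, then fusion
                nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.LeakyReLU(2 ** -4),
                nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.LeakyReLU(2 ** -4),
                nn.Flatten(), nn.LazyLinear(256), nn.LeakyReLU(2 ** -4),
            )
            self.heads = nn.ModuleList(       # back end: n feature decoding FFNs
                nn.Sequential(nn.Linear(256, 64), nn.LeakyReLU(2 ** -4),
                              nn.Linear(64, n_classes))
                for _ in range(n_heads))

        def forward(self, syndrome: torch.Tensor) -> list:
            feat = self.front_end(syndrome)               # shared feature information
            return [head(feat) for head in self.heads]    # one logit set per block

    logits = MultiTaskDecoder()(torch.randn(2, 1, 5, 9, 9))  # (batch, 1, T, L, L)
    print(len(logits), logits[0].shape)  # 7 torch.Size([2, 256])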
Step 1030: decode the feature information through the neural network decoder to obtain a decoding result.

As an optional implementation, when the neural network decoder includes the feature extraction network and n feature decoding networks, the control device may decode the feature information through the n feature decoding networks separately to obtain the decoding results corresponding to the n feature decoding networks, where the n feature decoding networks are networks trained by multi-task learning so as to have the capability of generating different decoding results.

For the first-class decoder, step 1030 may include:

decoding the feature information through n_1 feature decoding networks separately to obtain the decoding results corresponding to the n_1 feature decoding networks; where, for the i-th feature decoding network among the n_1 feature decoding networks, the decoding result corresponding to the i-th feature decoding network includes: the i-th canonical symptom related to the target error type, i being a positive integer not larger than n_1, a canonical symptom referring to a canonical decomposition result of the error symptom information;

decoding the feature information through n_2 feature decoding networks separately to obtain the decoding results corresponding to the n_2 feature decoding networks; where, for the j-th feature decoding network among the n_2 feature decoding networks, the decoding result corresponding to the j-th feature decoding network includes: a fixed representative element related to the target error type, j being a positive integer not larger than n_2; the sum of n_1 and n_2 equals n, and both are positive integers.

In some embodiments, the target error types include Pauli X errors and Pauli Z errors, n_1 equals the sum of m_1 and m_2, m_1 and m_2 are both positive integers, and n_2 equals 2;

m_1 of the n_1 feature decoding networks are used to decode the feature information separately to obtain m_1 canonical symptoms related to the Pauli X error;

m_2 of the n_1 feature decoding networks are used to decode the feature information separately to obtain m_2 canonical symptoms related to the Pauli Z error;

one of the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli X error;

the other of the n_2 feature decoding networks is used to decode the feature information to obtain a fixed representative element related to the Pauli Z error.

Optionally, the values of m_1 and m_2 may be the same or different. Optionally, the error symptom information includes Z-type symptom information and X-type symptom information; the Z-type symptom information is decoded to obtain X-type error result information, which indicates the qubits in the quantum circuit where Pauli X errors occur, and the X-type symptom information is decoded to obtain Z-type error result information, which indicates the qubits where Pauli Z errors occur.

Optionally, the m_1 canonical symptoms related to the Pauli X error may be denoted S_Z^(1), …, S_Z^(m_1), each containing a portion of the Z-type symptom bits and used in decoding to determine X-type errors. The m_2 canonical symptoms related to the Pauli Z error may be denoted S_X^(1), …, S_X^(m_2), each containing a portion of the X-type symptom bits and used in decoding to determine Z-type errors. The fixed representative element related to the Pauli X error may be denoted L_c^X, and that related to the Pauli Z error L_c^Z.

For the second-class decoder, the qubits contained in the quantum circuit are divided into n blocks, each block containing at least one qubit. For the k-th of the n feature decoding networks, the corresponding decoding result includes: the Pauli operator acting on the qubits contained in the k-th of the n blocks, k being a positive integer not larger than n.

Optionally, the k-th block may be denoted R_k, and the Pauli operator acting on the qubits contained in the k-th block R_k may be denoted E_k. In this way, the n feature decoding networks can obtain the Pauli operators E_1, E_2, …, E_n acting on the n blocks respectively.
Step 1040: determine error result information of the quantum circuit according to the decoding result.

In some embodiments, the error result information indicates the erroneous qubits in the quantum circuit.

Optionally, the error result information may further indicate the error types corresponding to the erroneous qubits in the quantum circuit.

As an optional implementation, when the neural network decoder includes the feature extraction network and n feature decoding networks, the control device may determine the error result information according to the decoding results corresponding to the n feature decoding networks.

Based on the error result information output by the neural network decoder, the erroneous qubits in the quantum circuit and the corresponding error types can be determined — for example, the positions of the erroneous data qubits and, at each such position, the error type of the erroneous data qubit, such as a Pauli X error, a Pauli Z error, or both (i.e., a Pauli Y error).

For the first-class decoder, step 1040 may include:

determining X-type error result information according to the fixed representative element related to the Pauli X error and the m_1 canonical symptoms related to the Pauli X error, the X-type error result information indicating the qubits in the quantum circuit where Pauli X errors occur; that is, determining the X-type error result information Ê_X from L_c^X and S_Z^(1), …, S_Z^(m_1);

determining Z-type error result information according to the fixed representative element related to the Pauli Z error and the m_2 canonical symptoms related to the Pauli Z error, the Z-type error result information indicating the qubits in the quantum circuit where Pauli Z errors occur; that is, determining the Z-type error result information Ê_Z from L_c^Z and S_X^(1), …, S_X^(m_2).

For the principles underlying the above determinations, refer to the descriptions in the embodiments above.

In some embodiments, for the first-class decoder, two neural network decoders may be trained, denoted the first neural network decoder and the second neural network decoder. As shown in Fig. 12, the feature extraction network of the first neural network decoder performs feature extraction on the Z-type error symptom information to obtain first feature information; the m_1+1 feature decoding networks of the first decoder then decode the first feature information separately to obtain their corresponding decoding results, where the decoding results of m_1 of these feature decoding networks comprise the m_1 canonical symptoms related to the Pauli X error, and the decoding result of the remaining one comprises the fixed representative element related to the Pauli X error; the X-type error result information, indicating the qubits in the quantum circuit where Pauli X errors occur, is determined from the m_1 canonical symptoms related to the Pauli X error and the fixed representative element related to the Pauli X error. In addition, the feature extraction network of the second neural network decoder performs feature extraction on the X-type error symptom information to obtain second feature information; the m_2+1 feature decoding networks of the second decoder decode it separately, with m_2 of them yielding the m_2 canonical symptoms related to the Pauli Z error and the remaining one yielding the fixed representative element related to the Pauli Z error; the Z-type error result information, indicating the qubits where Pauli Z errors occur, is determined from these. Both the first and second neural network decoders can be trained with the multi-task learning approach introduced above, and the numbers of feature decoding networks they contain may be the same or different, which this application does not limit.

In addition, the example shown in Fig. 12 merely decomposes the error symptom information into two parts, Z-type error symptom information and X-type error symptom information, fed respectively to the first and second neural network decoders, which helps reduce the computational complexity of the neural network decoders. In some other embodiments, the error symptom information may instead be fed, undecomposed, to both the first and second neural network decoders, with the first decoder obtaining the X-type error result information and the second decoder obtaining the Z-type error result information from the full error symptom information.

For the second-class decoder, step 1040 may include: determining the error result information according to the Pauli operators acting on the n blocks respectively, i.e., determining the error result information Ê from E_1, E_2, …, E_n. For the underlying principles, refer to the descriptions in the embodiments above.

In some embodiments, as shown in Fig. 13, for the second-class decoder, the error result information indicates both the qubits in the quantum circuit where Pauli X errors occur and the qubits where Pauli Z errors occur. That is, the error symptom information is not separated into X-type and Z-type parts; instead, all symptom bits are used jointly to decode X and Z errors, and correspondingly the decoding result is not separated into X-type and Z-type error result information but directly contains both X-type and Z-type errors. Since X and Z errors are mutually correlated, considering all symptom bits jointly in this way determines the positions of X and Z errors more precisely; the experimental data below will also show that this brings a large improvement in decoding performance. In this case, a single neural network decoder suffices to decode the error symptom information and obtain the error result information.

In some embodiments, for the second-class decoder, two neural network decoders may also be used, denoted the first and second neural network decoders. As shown in Fig. 14, both decoders take the full error symptom information as input, without distinguishing X-type from Z-type symptom information, jointly using all symptom bits to decode X and Z errors. The feature extraction network of the first decoder performs feature extraction on the error symptom information to obtain first feature information; the n feature decoding networks of the first decoder then decode it separately to obtain n first decoding results, where the first decoding result of the k-th feature decoding network comprises the Pauli operator, related to X-type errors, acting on the k-th block; the X-type error result information, indicating the qubits where Pauli X errors occur, is determined from the n first decoding results. In addition, the feature extraction network of the second decoder performs feature extraction on the error symptom information to obtain second feature information; the n feature decoding networks of the second decoder decode it separately to obtain n second decoding results, where the second decoding result of the k-th feature decoding network comprises the Pauli operator, related to Z-type errors, acting on the k-th block; the Z-type error result information, indicating the qubits where Pauli Z errors occur, is determined from the n second decoding results. Both the first and second neural network decoders can be trained with the multi-task learning approach introduced above, and the numbers of feature decoding networks they contain may be the same or different, which this application does not limit.
In the embodiments of this application, the output type of the neural network decoder is not limited. In one possible implementation, a physical-level output is used: the model directly generates the specific qubit information of the error, i.e., which specific qubit suffered what type of error. In another possible implementation, a logical-level output is used: the model outputs the logical error class of a specific error after a particular mapping, and the equivalent error occurring on the qubits can then be inferred from this logical error class (the inferred error need not be identical to the originally occurring error, but its effect is the same — the error degeneracy phenomenon unique to quantum error correction codes). Optionally, to reduce the complexity of the neural network decoder and thereby further shorten the decoding time, the neural network decoder may use logical-level output.

In some embodiments, while the neural network decoder is decoding already-acquired error symptom information, the measurement and collection of new error symptom information is performed in parallel. By parallelizing symptom measurement and decoding, there is no need to wait until all O(L) symptom measurements have finished before starting to decode: once enough symptom bits are available to complete a minimal unit of computation (e.g., a single convolution operation), the corresponding computation can begin. In this way, decoding can start during the subsequent symptom measurements and the two are parallelized, reducing the overall delay from the end of the last round of symptom measurement to the completion of error correction. To prevent the accumulation of errors during the correction process, the shorter this delay, the better.

In some embodiments, the feature extraction network and the n feature decoding networks of the neural network decoder are deployed on the same chip. Optionally, the chip may be an FPGA or an ASIC. Optionally, for the embodiments requiring two neural network decoders, the two decoders may be deployed on the same chip or on two chips, which this application does not limit.
In summary, the technical solutions provided by the embodiments of this application offer an error correction decoding solution based on a multi-task learning neural network model: the neural network decoder extracts corresponding feature information from the input error symptom information, then decodes the feature information and outputs decoding results describing the local information distributions of the noise decomposition, from which the error result information is determined. Compared with solutions using multiple neural network decoders, the solution of this application needs only a single neural network decoder to determine the error result information accurately, thereby fully improving decoding performance and shortening decoding time without increasing algorithm complexity and while maintaining scalability, and it is also relatively easy to deploy in hardware in engineering practice. For example, the above embodiments design the neural network decoder as a structure comprising a feature extraction network and multiple feature decoding networks: the feature extraction network extracts the corresponding feature information from the input error symptom information, this feature information serves simultaneously as the input of the multiple feature decoding networks, the multiple feature decoding networks output the local information distributions of the noise decomposition, and the error result information is then determined from their decoding results, achieving the above benefits relative to multi-decoder solutions.

In addition, for the second-class decoder above, decomposing directly according to the distribution of Pauli errors over the physical qubits enables end-to-end inference and training. Moreover, since X and Z errors are mutually correlated, decoding with X-type and Z-type symptoms simultaneously and considering all symptom bits jointly determines the positions of X and Z errors more precisely, providing a large improvement in decoding performance.
In some embodiments, for the feature extraction network of the neural network decoder, this application proposes using LFEM (Local Feature Extraction Mapping) to compress the computational complexity.

Besides the output-side complexity caused by measurement noise (addressed through multi-task learning), another complexity of the neural network decoder lies in its own training and inference. The embodiments of this application offer the following approach: regard the large-scale error correction code as multiple small-scale error correction codes (which may be called "small error correction codes"); after a "small error correction code" is locally "decoded", the obtained information is aggregated at the next higher level for "decoding". This process can recurse until the final decoded information is the error that needs to be corrected. The decoding of the "small error correction code" of a specific region at each layer may be called an LFEM.

In some embodiments, the feature extraction network includes multiple cascaded feature extraction subnetworks, where the input data of the first subnetwork includes the error symptom information, the input data of the s-th subnetwork includes the output data of the (s-1)-th subnetwork, the output data of the last subnetwork includes the feature information, and s is an integer greater than 1.

For a target feature extraction subnetwork among the multiple cascaded subnetworks, the input data of the target subnetwork is divided into multiple input data blocks of the same scale. The target subnetwork may be any one of the cascaded feature extraction subnetworks. The target subnetwork is used to perform multiple local feature extraction mappings on the input data blocks to obtain multiple groups of mapping output data: each LFEM maps the same-position regions of the multiple input data blocks to obtain one group of mapping output data, and different LFEM applications map different-position regions of the multiple input data blocks to obtain the multiple groups of mapping output data. The target subnetwork then obtains its output data according to the multiple groups of mapping output data.

As shown in Fig. 15, which exemplarily illustrates a feature extraction subnetwork performing local feature extraction mapping: Fig. 15 shows the process of performing LFEM on regions at two different positions (distinguished by the labels 1 and 2 in the figure); a single LFEM acts on the same-position regions of the C_i input data blocks, while different LFEM applications act on regions at different positions of the C_i input data blocks, e.g., the two regions labeled 1 and 2 in the figure.

In some embodiments, regions at different positions overlap, which means that overlapping data exists between regions at different positions; the overlap may exist in some or all of the three dimensions. To compress the computational complexity, the overlap between the "small error correction codes" of each layer is kept small in all three dimensions; the number of layers of local "error correction" then grows as O(log L). Setting regions at different positions to overlap helps improve the decoding effect, because the same portion of information, after being acted on by two LFEMs, can be cross-validated to improve decoding performance; but the overlap should not be excessive, lest it generate extra computational complexity. It should be noted that the network parameters involved in different LFEMs are identical; only the input regions they act on differ.
In some embodiments, the target feature extraction subnetwork includes: at least one convolutional layer and at least one fully connected layer. The at least one convolutional layer is used to perform the multiple local feature extraction mappings on the multiple input data blocks to obtain the multiple groups of mapping output data. The at least one fully connected layer is used to obtain the output data of the target feature extraction subnetwork according to the multiple groups of mapping output data.

When constructing a feature extraction subnetwork, the simplest way is to use a single-layer 3D CNN, but the expressive power of such a neural network is limited, and decoding performance suffers considerably when the error correction code scale is large. We therefore consider connecting, after the 3D CNN kernels have acted on the same region of all input three-dimensional information blocks (C_i in total; the dashed boxes with the same label in Fig. 15), an FFN for further information integration and compression. This FFN may contain a varying number of fully connected layers; in practice the maximum number of layers can be limited to 2.
Overall parameter complexity analysis: for real-time decoding in hardware, the parameters must be configured onto the computing device (e.g., FPGA or ASIC) in advance; the number of parameters determines how much on-chip memory will eventually be occupied. The number of parameters of each feature extraction subnetwork is determined by the structure of its LFEM: the 3D CNN of layer i has K_i·C_i·M^3 parameters, where M = max{k_i} is the side length of the largest convolution kernel, and the FFN has C_{i+1}·K_i parameters. If C_i, C_{i+1}, K_i ~ O(1), the total number of front-end parameters is:

Σ_{i=1}^{O(log L)} (K_i C_i M^3 + C_{i+1} K_i) ~ O(log L)

The back end has O(L^2) parameters, so the total number of parameters is O(L^2). The constant hidden under the O is usually large, so for small L the front end may actually occupy more parameters than the back end. Both this asymptotic growth and the actual parameter counts of real models obtained in testing are acceptable in practical engineering.

Depth (or computation time) of the overall algorithm: this part is determined by the fastest way of executing multiplications and additions. All multiplications of the 3D CNN and FFN parts of a feature extraction subnetwork (FFN depth <= 2) can complete in O(1) time at the fastest, and the accumulation following the multiplications requires O(log C_i) ~ O(1) steps, so a single feature extraction subnetwork takes O(1) total computation time. There are O(log L) feature extraction subnetworks in total, so the total front-end computation time (depth) is O(log L). The total back-end computation can be fully parallelized, and the O(L^2) back-end feature decoding networks have size independent of L, O(1) layers, O(1) multiplication time and O(log C) ~ O(1) accumulation time. Hence the depth of the whole algorithm is O(log L). This algorithm time is the shortest computation time theoretically achievable when computing resources are sufficiently abundant.

Overall operation-count analysis: each layer's input three-dimensional feature blocks are the previous layer's output feature blocks. Suppose layer i has C_i input feature blocks and C_{i+1} output feature blocks. If C_i, C_{i+1}, K_i ~ O(1) and each LFEM's input scale is O(1), then the number of LFEMs this layer must apply is ~O((L/k^i)^3), where k = min{k_i} ~ O(1) is the smallest convolution kernel of the whole network. The front-end multiplication count is then:

Σ_{i=1}^{O(log L)} K_i C_i M^3 (L/k^i)^3 ~ O(L^3)

Since the multiplication count dominates, and the back-end multiplication count is ~O(L^2), the complexity of the whole decoding computation can be kept at O(L^3). Without multi-task learning, the computational complexity of the whole decoding process would grow to O(L^2·L^3) = O(L^5), which is unacceptable in practical engineering. A short derivation of the O(L^3) bound follows.
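The O(L^3) bound above is just a geometric series; a short derivation in LaTeX, assuming K_i <= K, C_i <= C, kernel side M ~ O(1) and shrink factor k >= 2:

    \[
    \sum_{i=1}^{O(\log L)} K_i C_i M^3 \left(\frac{L}{k^{i}}\right)^{3}
    \;\le\; K C M^3 L^3 \sum_{i\ge 1} k^{-3i}
    \;=\; \frac{K C M^3 L^3}{k^{3}-1}
    \;=\; O(L^3)
    \]
    % so the first (largest) layer dominates the front-end multiplication count.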
In addition, there are many different types of neural network activation layers. For convenience of hardware deployment, this application uses only two of them, ReLU and LeakyReLU, where LeakyReLU is defined as:

f(x) = x for x >= 0, and f(x) = a·x for x < 0

with a < 0 chosen and the requirement |a| = 2^(-n) for some n in N+, where N+ represents the set of positive integers. The benefit of this is that computing the activation layer only requires determining the sign bit and right-shifting a finite number of bits, greatly simplifying the computational implementation. Simulations show that LeakyReLU already suffices to provide quite good results.
在一些实施例中,神经网络解码器的训练过程如下:
1.获取样本错误症状信息,以及样本错误症状信息样本对应的样本错误结果信息;
2.通过待训练的神经网络解码器根据样本错误症状信息,得到n个特征解码网络分别对应的预测解码结果;
3.根据n个特征解码网络分别对应的预测解码结果,以及基于样本错误结果信息确定的n个特征解码网络分别对应的标签解码结果,确定n个特征解码网络分别对应的损失函数值;
4.根据n个特征解码网络分别对应的损失函数值,确定总损失函数值;
5.根据总损失函数值对待训练的神经网络解码器进行参数调整,得到训练后的神经网络解码器。
设定了模型网络结构,需要对模型进行训练。由于在输出端需要学习多个分布函数,对n个特征解码网络的输出分别使用交叉熵(Cross Entropy)损失函数进行训练。同时在产生输入-输出训练数据时,指定输入端即为随机产生的症状(可以是单独X或Z症状,也可以是同时包含这两类症状),而输出端为该症状对应的输出,并进行独热编码(one hot)。注意,对同一个症状的输入,可能对应不同的输出。这种输出端的多样性就使得训练过程中模型最终可以学到根据特定输入症状的输出概率分布。
For the i-th of the n feature decoding networks, the loss function value of the i-th feature decoding network is determined from its predicted decoding result and from the label decoding result determined for it from the sample error result information. The loss function value of the i-th feature decoding network measures the similarity between its predicted decoding result and its label decoding result. The label decoding result of the i-th feature decoding network is its preset output label, i.e., the desired output result; the goal of training the i-th feature decoding network is to make its predicted decoding result identical or as close as possible to that label decoding result. The loss function value of each of the n feature decoding networks can be determined in this way.
After the loss function values of the n feature decoding networks are obtained, a weighted sum of them can be taken to obtain the total loss function value, which characterizes the performance of the whole neural network decoder. The weight values of the individual feature decoding networks may be equal or unequal; the present application places no restriction on this.
Gradient descent can then be used, with minimizing the total loss function value as the objective, to compute the parameter adjustment gradients of the neural network decoder, and the parameters of the decoder to be trained are adjusted according to those gradients to obtain the trained decoder. The parameters of the neural network decoder include the weight parameters of each neural network it contains.
Note that if the canonical decomposition of errors is used as the output (the first decomposition scheme), a one-to-one correspondence between inputs and outputs in the training data must be ensured. If one input symptom corresponded to several estimated output symptoms S_{X|Z}, then, because the correlations between inferred symptoms have been severed, this one-input-multiple-output situation would cause locally inconsistent canonical symptom inferences with O(1) probability. Unlike inferring the noise occurring on the physical qubits, an inconsistent canonical symptom inference immediately causes a decoding failure with O(1) probability. Training data generated directly from simulation cannot guarantee this one-to-one correspondence, so when the canonical representation decomposition is used, a third-party decoder must be used to produce one-to-one input-output pairs. A natural choice is an MWPM decoder that produces a single output for each symptom generated by simulation; if computational complexity is not a concern, better and more sophisticated known decoders may also be considered. In this case, the two types of symptoms can be separated and used to decode the two types of errors respectively (since that is the input-output pattern MWPM produces), compressing the overall computational complexity at the cost of a small loss in decoding performance.
For the second decomposition scheme, the canonical representation of the raw error data produced by simulation is used directly as the multi-task learning output labels, with X-type and Z-type symptoms used together as the model input. The training set is then allowed to contain several outputs for the same input symptom, because different inference results over the physical-qubit distribution cause at most residual local physical-qubit errors and do not constitute logical errors. Moreover, learning the distribution of physical errors, rather than only a maximum-likelihood answer, yields better decoding performance.
In the actual training process, the classical Adam algorithm can be used for both decomposition schemes, with a batch size of 1000 or more. Writing the loss function of the i-th feature decoding network as loss_i (i a positive integer), all the loss functions are summed to produce the total loss function:

Loss = Σ_{i=1}^{n} α_i · loss_i,

where Loss is the total loss function of the neural network decoder, loss_i is the loss function of the i-th feature decoding network, and α_i is the weight of that loss function. Multi-task joint learning is then performed by gradient descent on Loss using the Adam algorithm. In practice all α_i can be set to 1, though other values may also be used. In each training epoch the learning rate can be decreased gradually, or first increased and then decreased, handled flexibly according to the actual situation.
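A minimal sketch of this multi-task training step follows, assuming a hypothetical `decoder` whose forward pass returns one logit tensor per feature decoding network; the names and hyper-parameters are illustrative, not the patent's literal implementation.

```python
import torch
import torch.nn.functional as F

def train_step(decoder, optimizer, symptoms, labels, alphas):
    """One Adam step on Loss = sum_i alpha_i * loss_i (cross entropy per head)."""
    optimizer.zero_grad()
    outputs = decoder(symptoms)  # list of n per-head logits
    total = sum(a * F.cross_entropy(out, lab)
                for a, out, lab in zip(alphas, outputs, labels))
    total.backward()
    optimizer.step()
    return total.item()

# optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-3)
# Per the text, all alphas can be set to 1 and the batch size taken >= 1000.
```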
In some embodiments, the division into blocks is related to the correlations between errors: qubits contained in the same block are prone to correlated errors.
Here, "qubits contained in the same block are prone to correlated errors" means that the probability of correlated errors arising among qubits within the same block is greater than the probability of correlated errors arising among qubits in different blocks.
For example, the probability of correlated Pauli X errors or Pauli Z errors among qubits within the same block is greater than the probability of correlated Pauli X or Pauli Z errors among qubits in different blocks.
As mentioned above, if the second decomposition scheme is adopted, the model can be trained on the raw errors produced during simulation rather than on indirect error data generated by another decoder. As noted earlier, to reduce the complexity of the error correction algorithm on a large scale, the physical qubits can be partitioned:

R = R_1 ∪ R_2 ∪ … ∪ R_n, with R_i ∩ R_j = ∅ for i ≠ j.

In the multi-task learning, only the errors acting on each R_i are then considered. Choosing the R_i appropriately has a considerable impact on decoding performance: the number of qubits contained in each R_i should be kept within some constant, while R_i should also cover as many of the possible local error correlations as possible. The most typical error correlations are those introduced by the symptom measurement circuit. As shown in FIG. 16, the typical correlated errors produced by the symptom measurement circuit are two-body X and Z errors along the diagonals; dashed box 161 in FIG. 16 marks a two-body X error and dashed box 162 a two-body Z error. Therefore, when partitioning the qubits into regions R, the partition should, for both error types, include as many of these diagonal patterns as possible.
FIG. 17 shows one partition of the qubits for Z errors in the case L = 9. In FIG. 17, each small white circle represents a physical qubit, and physical qubits connected by the same thick black line belong to the same block. Viewed from bottom to top, the block sizes |R_i| are {12, 12, 12, 12, 12, 12, 9}, and the output for each block is the Pauli error E_Z carried by the physical qubits it contains. For X-type errors, each qubit subset of the partition is rotated by 90 degrees.
By taking the correlations between errors into account when dividing the physical qubits of the quantum circuit into blocks, so that qubits in the same block are prone to correlated errors, the decoding performance can be improved further.
In some embodiments, during training of the neural network decoder, multiple different block partition schemes are used to divide the qubits of the sample quantum circuit, and the neural network decoder is jointly trained on these different partition schemes; when the neural network decoder is used, one of the multiple partition schemes is used to divide the qubits of the quantum circuit.
When the second type of decoder is used and trained end-to-end on the raw errors produced during simulation, the decoding quality exhibits a so-called error floor in the low physical error rate regime: at low physical error rates, the logical error rate decreases only very slowly as the physical error rate drops, and the code loses the decoding benefit it ought to provide.
The reason is that at sufficiently low physical error rates, higher-order correlated errors dominate, and such errors can hardly be covered simultaneously by a single region R_i, so the marginal probabilities obtained during decoding cannot capture these correlations. At the same time, the size of each region R_i must be kept within a certain range to limit the complexity of the output end. In fact, unless all physical qubits are covered, a larger qubit count in R_i is not automatically better; what matters more is whether it effectively contains the likely higher-order correlated noise. Under this constraint, to reduce the influence of higher-order correlated noise, multiple partitions are used and cross-validated during training, as follows:
Consider t partitions:

{R_i^(1)}, {R_i^(2)}, …, {R_i^(t)}.

These partitions should differ from one another as much as possible while still respecting the correlated noise. FIG. 18 shows another partition of the qubits for L = 9, different from the partition in FIG. 17.
In the training phase, the multi-task networks corresponding to all t partitions are placed at the back end of the decoding neural network simultaneously, and the distributions over the corresponding regions of these different partitions are trained jointly. Concretely, the total loss function is redefined as:

Loss = Σ_{j=1}^{t} Σ_i α_i^(j) · loss_i^(j),

and learned by stochastic gradient descent. This achieves cross-validation during the training phase, eliminating the influence of higher-order correlated errors as far as possible already at training time. Once training is complete, only one of these partitions, e.g. {R_i^(1)}, is actually used for decoding. Thus, although the training complexity increases, neither the complexity nor the computation time of the error correction algorithm itself changes, so the actual decoding latency and the engineering deployment are unaffected. Simulation results show that t = 2 already suffices to remove most of the influence of higher-order correlated errors and to substantially improve the logical error rate of the error correction algorithm in the low physical error rate regime.
In some embodiments, the chip on which the above neural network decoder is deployed may adopt a single-core architecture or a multi-core architecture, where "single-core" and "multi-core" refer to the number of processors (also called processing cores) the chip contains.
In the single-core case, the chip includes a single processor that performs all the execution steps of the neural network decoder introduced in the embodiments above. As explained above, the computational complexity of the decoding method provided by the present application is O(L^3); for small L a single-core architecture can bear this complexity, but for large L a single core becomes limiting. The present application therefore also proposes a multi-core solution.
In the multi-core case, the chip includes multiple processors arranged in a tree structure. The embodiments of the present application do not limit the number of processors in the multi-core architecture; it can be designed according to the size of L or the computational complexity, so that the whole decoding algorithm is completed while fully exploiting the computing power of every processor.
In the multi-core case, any two processors without a connection between them can run in parallel, which exploits the computing power of the processors to the fullest and shortens the decoding time, while any two connected processors execute sequentially. For example, FIG. 19 shows a schematic diagram of a multi-core architecture. Processors 1 to p have no connections among one another and can execute in parallel, e.g., processing different blocks of the error symptom information with a divide-and-conquer strategy and/or applying LFEM to different input data blocks. The features extracted by processors 1 to p are fed to processor p+1, which processes them to obtain the feature information. The feature information is then fed to processors p+2 to N, which again have no connections among one another and can run in parallel, e.g., with each processor hosting one feature decoding network that decodes the feature information to obtain the corresponding decoding result. Finally, one processor can determine the error result information from the decoding results of the individual feature decoding networks; the sketch below illustrates this execution pattern.
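The tree-parallel pattern can be sketched as follows, with Python's thread pool standing in for the processors; `extract_part`, `merge`, `decode_heads`, and `finalize` are hypothetical stand-ins for the deployed sub-networks, not APIs from the text.

```python
from concurrent.futures import ThreadPoolExecutor

def run_tree(symptom_blocks, extract_part, merge, decode_heads, finalize):
    with ThreadPoolExecutor() as pool:
        # Processors 1..p: mutually unconnected, so they run in parallel.
        parts = list(pool.map(extract_part, symptom_blocks))
        # Processor p+1: connected to its children, runs after them.
        features = merge(parts)
        # Processors p+2..N: again mutually unconnected, run in parallel.
        results = list(pool.map(lambda head: head(features), decode_heads))
    # A final stage combines the per-head decoding results.
    return finalize(results)
```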
For processors that execute in parallel, the information to be processed can be sent to each of them at the same time so that they can run concurrently. For example, the different blocks of the error symptom information can be sent to processors 1 to p at the same time so that those p processors execute in parallel; likewise, the feature information can be sent to processors p+2 to N at the same time. In some other embodiments, a separate controller or processor can be provided to control the execution timing of the processors, e.g., starting the mutually parallel processors simultaneously and running the serial processors in order, so as to better coordinate the processors and guarantee the correctness and stability of the processing flow.
Because the multi-task-learning-based neural network decoder provided by the present application has inherent parallelism in both its feature extraction part and its feature decoding part, it can easily be distributed across multiple processors for execution. Moreover, the inputs of different processors are largely independent: almost no processor-to-processor communication is required, and only a small number of processors need to communicate to transfer data. The method is arbitrarily parallelizable: with every processor fully utilized, the computation can always be scaled up by adding more processors, maintaining an O(log L) decoding latency.
Simulation experiments show that the technical solution provided by the present application brings improvements in the following respects.
1. Fewer models, easier hardware system deployment
Regardless of the scale of the error correction code, only two models are used: one outputs X-type errors and one outputs Z-type errors. Given the computing power of current FPGAs, we first focus on the first type of decoder (canonical decomposition of errors at the output end). FIG. 20 shows the simulated decoding performance for different output canonical symptom sizes after training with the indirect training data produced by MWPM. It can be seen that in this case the decoding performance is almost independent of the output canonical symptom size. Meanwhile, with only 2 models, the decoding performance essentially approaches that of the MWPM decoder used for training, especially at low physical error rates. This shows that the shared front end of the multi-task learning decoder provided by the present application can indeed capture all the feature information required for high-performance decoding.
Regarding the performance when actually deployed on hardware, we focus on the case L = 5 (49 data and auxiliary qubits in total) with 10 rounds of symptom measurement. The output end is divided into three outputs, corresponding to two canonical symptoms each carrying 12 bits of information plus the fixed representative element; the corresponding model uses about 330,000 parameters. After quantizing this model to 8-bit unsigned integers (UINT8), the two networks are deployed on two Intel Stratix 10 SX FPGAs, as shown in FIG. 21, where 201 and 202 denote the two Intel Stratix 10 SX FPGAs used to decode X-type and Z-type errors respectively.
To simulate the whole decoding process, the generation of quantum noise and the running of the noisy symptom measurement circuit are simulated on a computer; after 10 rounds of symptom measurement, the resulting symptoms (120 classical bits) are split into X-type and Z-type symptoms (60 each) and sent over network ports to the 2 FPGAs. When the FPGAs finish decoding, the decoded error information is sent back to the computer to judge whether decoding succeeded. After large-scale Monte Carlo experiments, the decoding performance achieved with the FPGAs is as shown in FIG. 22, with an overall decoding latency of 700 ns. By using the higher-end Intel Stratix 10 SX and starting to decode as soon as part of the symptom is received, the whole decoding process takes 280 ns in total from receiving the symptom to completion, the fastest known record for true hardware decoding of 49 qubits.
2. Substantially improved decoding performance
The second type of decoder substantially improves decoding performance by using:
(1) the second type of error decomposition;
(2) both types of symptoms together as the decoder input;
(3) output region partitions that respect the generation patterns of correlated errors;
(4) joint learning with cross-validation during training over multiple different partition schemes.
The decoder provided by the present application can thus substantially improve actual decoding capability with low computational complexity, low network complexity and low computation depth.
FIG. 23 shows the results for L = 5 and L = 7 (49 and 97 physical qubits respectively, including data and auxiliary qubits). It can be seen that, over the whole range of physical error rates, the second type of decoder performs much better than MWPM, with a logical error rate at most 1/2 that of MWPM, making it the best-performing fault-tolerant decoder currently known.
The following are device embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the device embodiments, please refer to the method embodiments of the present application.
Please refer to FIG. 24, which shows a block diagram of a neural-network-based quantum error correction decoding apparatus provided by one embodiment of the present application. The apparatus has the functionality to implement the above method examples; the functionality may be implemented by hardware, or by hardware executing corresponding software. The apparatus may be a computer device, or may be arranged in a computer device. The apparatus 2400 may include: a symptom acquisition module 2410, a feature extraction module 2420, a feature decoding module 2430 and a result determination module 2440.
The symptom acquisition module 2410 is configured to obtain error symptom information obtained by symptom measurement of a quantum circuit.
The feature extraction module 2420 is configured to extract feature information from the error symptom information using a neural network decoder, wherein the neural network decoder includes a feature extraction network and n feature decoding networks, n being an integer greater than 1.
The feature decoding module 2430 is configured to decode the feature information by the neural network decoder to obtain a decoding result.
The result determination module 2440 is configured to determine error result information of the quantum circuit based on the decoding result.
In some embodiments, the feature extraction module 2420 is configured to perform feature extraction on the error symptom information by the feature extraction network of the neural network decoder to obtain the feature information, wherein the neural network decoder includes the feature extraction network and the n feature decoding networks, n being an integer greater than 1.
The feature decoding module 2430 is configured to decode the feature information by the n feature decoding networks respectively, obtaining the decoding results corresponding to the n feature decoding networks respectively, wherein the n feature decoding networks are networks trained by multi-task learning so as to be capable of generating different decoding results.
The result determination module 2440 is configured to determine the error result information based on the decoding results corresponding to the n feature decoding networks respectively.
In some embodiments, the qubits of the quantum circuit are divided into n blocks, each containing at least one qubit. For the k-th of the n feature decoding networks, the decoding result corresponding to the k-th feature decoding network includes: the Pauli operator acting on the qubits contained in the k-th of the n blocks, k being a positive integer less than or equal to n.
The result determination module 2440 is configured to determine the error result information based on the Pauli operators acting on the n blocks respectively.
In some embodiments, the error result information indicates the qubits of the quantum circuit on which Pauli X errors occur and the qubits on which Pauli Z errors occur.
In some embodiments, the division into blocks is related to the correlations between errors, and qubits contained in the same block are prone to correlated errors.
In some embodiments, during training of the neural network decoder, multiple different block partition schemes are used to divide the qubits of a sample quantum circuit, and the neural network decoder is jointly trained based on the multiple different block partition schemes; during use of the neural network decoder, one of the multiple different block partition schemes is used to divide the qubits of the quantum circuit.
In some embodiments, the feature decoding module 2430 is configured to:
decode the feature information by n1 feature decoding networks respectively, obtaining the decoding results corresponding to the n1 feature decoding networks respectively, wherein, for the i-th of the n1 feature decoding networks, the decoding result corresponding to the i-th feature decoding network includes: the i-th canonical symptom related to a target error type, i being a positive integer less than or equal to n1, a canonical symptom being a canonical decomposition result of the error symptom information;
decode the feature information by n2 feature decoding networks respectively, obtaining the decoding results corresponding to the n2 feature decoding networks respectively, wherein, for the j-th of the n2 feature decoding networks, the decoding result corresponding to the j-th feature decoding network includes: a fixed representative element related to the target error type, j being a positive integer less than or equal to n2;
wherein the sum of n1 and n2 equals n, and n1 and n2 are both positive integers.
In some embodiments, the target error type includes the Pauli X error and the Pauli Z error, n1 equals the sum of m1 and m2, m1 and m2 are both positive integers, and n2 equals 2;
m1 of the n1 feature decoding networks are configured to decode the feature information respectively to obtain m1 canonical symptoms related to the Pauli X error;
m2 of the n1 feature decoding networks are configured to decode the feature information respectively to obtain m2 canonical symptoms related to the Pauli Z error;
one of the n2 feature decoding networks is configured to decode the feature information to obtain a fixed representative element related to the Pauli X error;
the other of the n2 feature decoding networks is configured to decode the feature information to obtain a fixed representative element related to the Pauli Z error;
the result determination module 2440 is configured to:
determine X-type error result information based on the fixed representative element related to the Pauli X error and the m1 canonical symptoms related to the Pauli X error, the X-type error result information indicating the qubits of the quantum circuit on which the Pauli X error occurs;
determine Z-type error result information based on the fixed representative element related to the Pauli Z error and the m2 canonical symptoms related to the Pauli Z error, the Z-type error result information indicating the qubits of the quantum circuit on which the Pauli Z error occurs.
In some embodiments, the feature extraction network comprises multiple cascaded feature extraction sub-networks, wherein the input data of the 1st feature extraction sub-network includes the error symptom information, the input data of the s-th feature extraction sub-network includes the output data of the (s-1)-th feature extraction sub-network, the output data of the last feature extraction sub-network includes the feature information, and s is an integer greater than 1.
For a target feature extraction sub-network among the multiple cascaded feature extraction sub-networks, the input data of the target feature extraction sub-network is divided into multiple input data blocks of the same scale.
The target feature extraction sub-network is configured to perform multiple local feature extraction mappings on the multiple input data blocks to obtain multiple groups of mapped output data, wherein each local feature extraction mapping maps the region at the same position in the multiple input data blocks to obtain one group of mapped output data, and different local feature extraction mappings map regions at different positions in the multiple input data blocks to obtain the multiple groups of mapped output data.
The target feature extraction sub-network is further configured to obtain its output data based on the multiple groups of mapped output data.
In some embodiments, the regions at different positions overlap.
In some embodiments, the target feature extraction sub-network comprises: at least one convolutional layer and at least one fully connected layer;
the at least one convolutional layer is configured to perform the multiple local feature extraction mappings on the multiple input data blocks to obtain the multiple groups of mapped output data;
the at least one fully connected layer is configured to obtain the output data of the target feature extraction sub-network based on the multiple groups of mapped output data.
In some embodiments, while the neural network decoder is decoding error symptom information that has already been acquired, the measurement and acquisition of new error symptom information is performed in parallel.
In some embodiments, the training process of the neural network decoder is as follows:
obtaining sample error symptom information and sample error result information corresponding to the sample error symptom information;
obtaining, by the neural network decoder to be trained and based on the sample error symptom information, predicted decoding results corresponding to the n feature decoding networks respectively;
determining loss function values corresponding to the n feature decoding networks respectively, based on the predicted decoding results corresponding to the n feature decoding networks respectively and on label decoding results corresponding to the n feature decoding networks respectively determined based on the sample error result information;
determining a total loss function value based on the loss function values corresponding to the n feature decoding networks respectively;
adjusting parameters of the neural network decoder to be trained according to the total loss function value to obtain the trained neural network decoder.
In some embodiments, the feature extraction network and the n feature decoding networks included in the neural network decoder are deployed on the same chip.
In some embodiments, the chip on which the neural network decoder is deployed comprises multiple processors arranged in a tree structure, and any two processors that have no connection relationship can operate in parallel.
It should be noted that, when the apparatus provided by the above embodiments realizes its functions, only the division into the functional modules described above is used as an example; in practical applications, these functions may be assigned to different functional modules as required, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus provided by the above embodiments belongs to the same conception as the method embodiments; for its specific implementation process, refer to the method embodiments, which will not be repeated here.
Please refer to FIG. 25, which shows a schematic structural diagram of a computer device provided by one embodiment of the present application. The computer device may be the control device 43 in the application scenario of the solution shown in FIG. 4, and may be used to implement the neural-network-based quantum error correction decoding method provided in the above embodiments. Specifically:
The computer device 2500 includes a processing unit 2501 (e.g., including a CPU and/or a GPU), a system memory 2504 including a random access memory (RAM) 2502 and a read-only memory (ROM) 2503, and a system bus 2505 connecting the system memory 2504 and the processing unit 2501. The computer device 2500 also includes a basic input/output system (I/O system) 2506 that helps transfer information between the components within the computer, and a mass storage device 2507 for storing an operating system 2513, application programs 2514 and other program modules 2515.
The basic input/output system 2506 includes a display 2508 for displaying information and an input device 2509, such as a mouse or a keyboard, for user input. The display 2508 and the input device 2509 are both connected to the processing unit 2501 through an input/output controller 2510 connected to the system bus 2505. The basic input/output system 2506 may also include the input/output controller 2510 for receiving and processing input from a number of other devices such as a keyboard, a mouse or an electronic stylus.
The mass storage device 2507 is connected to the processing unit 2501 through a mass storage controller (not shown) connected to the system bus 2505. The mass storage device 2507 and its associated computer-readable media provide non-volatile storage for the computer device 2500. That is, the mass storage device 2507 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM (Compact Disc Read-Only Memory) drive.
Without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other solid-state storage technologies; CD-ROM, DVD (Digital Video Disc) or other optical storage; and magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will know that the computer storage media are not limited to the above. The system memory 2504 and the mass storage device 2507 described above may be collectively referred to as the memory.
According to various embodiments of the present application, the computer device 2500 may also operate by being connected through a network, such as the Internet, to a remote computer on that network. That is, the computer device 2500 may be connected to a network 2512 through a network interface unit 2511 connected to the system bus 2505, or the network interface unit 2511 may be used to connect to other types of networks or remote computer systems (not shown).
The memory stores a computer program configured to be executed by one or more processors to implement the neural-network-based quantum error correction decoding method provided by the above embodiments.
In an exemplary embodiment, a computer-readable storage medium is also provided, storing a computer program which, when executed by a processor of a computer device, implements the neural-network-based quantum error correction decoding method provided by the above embodiments. In an exemplary embodiment, the computer-readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program product is also provided which, when executed, implements the neural-network-based quantum error correction decoding method provided by the above embodiments.
In an exemplary embodiment, a chip is also provided, including a programmable logic circuit and/or program instructions, which, when run on a computer device, implements the neural-network-based quantum error correction decoding method provided by the above embodiments.
Optionally, the chip is an FPGA chip or an ASIC chip.

Claims (20)

  1. A neural-network-based quantum error correction decoding method, the method being executed by a control device and comprising:
    obtaining error symptom information obtained by symptom measurement of a quantum circuit;
    extracting feature information from the error symptom information using a neural network decoder;
    decoding the feature information by the neural network decoder to obtain a decoding result; and
    determining error result information of the quantum circuit based on the decoding result.
  2. The method according to claim 1, wherein the extracting feature information from the error symptom information using a neural network decoder comprises:
    performing feature extraction on the error symptom information by a feature extraction network of the neural network decoder to obtain the feature information, wherein the neural network decoder comprises the feature extraction network and n feature decoding networks, n being an integer greater than 1;
    the decoding the feature information by the neural network decoder to obtain a decoding result comprises:
    decoding the feature information by the n feature decoding networks respectively to obtain decoding results corresponding to the n feature decoding networks respectively, wherein the n feature decoding networks are networks trained by multi-task learning so as to be capable of generating different decoding results; and
    the determining error result information of the quantum circuit based on the decoding result comprises:
    determining the error result information based on the decoding results corresponding to the n feature decoding networks respectively.
  3. The method according to claim 2, wherein the qubits of the quantum circuit are divided into n blocks, each block containing at least one qubit;
    for a k-th feature decoding network among the n feature decoding networks, the decoding result corresponding to the k-th feature decoding network comprises: a Pauli operator acting on the qubits contained in a k-th block among the n blocks, k being a positive integer less than or equal to n; and
    the determining the error result information based on the decoding results corresponding to the n feature decoding networks respectively comprises:
    determining the error result information based on the Pauli operators acting on the n blocks respectively.
  4. The method according to claim 3, wherein the error result information indicates qubits of the quantum circuit on which Pauli X errors occur and qubits on which Pauli Z errors occur.
  5. The method according to claim 3 or 4, wherein the division into blocks is related to correlations between errors, and the probability of correlated errors arising among qubits contained in the same block is greater than the probability of correlated errors arising among qubits contained in different blocks.
  6. The method according to any one of claims 3 to 5, wherein, during training of the neural network decoder, multiple different block partition schemes are used to divide the qubits of a sample quantum circuit, and the neural network decoder is jointly trained based on the multiple different block partition schemes; and
    during use of the neural network decoder, one of the multiple different block partition schemes is used to divide the qubits of the quantum circuit.
  7. The method according to any one of claims 2 to 6, wherein the decoding the feature information by the n feature decoding networks respectively to obtain the decoding results corresponding to the n feature decoding networks respectively comprises:
    decoding the feature information by n1 feature decoding networks respectively to obtain decoding results corresponding to the n1 feature decoding networks respectively, wherein, for an i-th feature decoding network among the n1 feature decoding networks, the decoding result corresponding to the i-th feature decoding network comprises: an i-th canonical symptom related to a target error type, i being a positive integer less than or equal to n1, a canonical symptom being a canonical decomposition result of the error symptom information; and
    decoding the feature information by n2 feature decoding networks respectively to obtain decoding results corresponding to the n2 feature decoding networks respectively, wherein, for a j-th feature decoding network among the n2 feature decoding networks, the decoding result corresponding to the j-th feature decoding network comprises: a fixed representative element related to the target error type, j being a positive integer less than or equal to n2;
    wherein the sum of n1 and n2 equals n, and n1 and n2 are both positive integers.
  8. The method according to claim 7, wherein the target error type comprises a Pauli X error and a Pauli Z error, n1 equals the sum of m1 and m2, m1 and m2 are both positive integers, and n2 equals 2;
    m1 feature decoding networks among the n1 feature decoding networks are configured to decode the feature information respectively to obtain m1 canonical symptoms related to the Pauli X error;
    m2 feature decoding networks among the n1 feature decoding networks are configured to decode the feature information respectively to obtain m2 canonical symptoms related to the Pauli Z error;
    one feature decoding network among the n2 feature decoding networks is configured to decode the feature information to obtain a fixed representative element related to the Pauli X error;
    the other feature decoding network among the n2 feature decoding networks is configured to decode the feature information to obtain a fixed representative element related to the Pauli Z error; and
    the determining the error result information based on the decoding results corresponding to the n feature decoding networks respectively comprises:
    determining X-type error result information based on the fixed representative element related to the Pauli X error and the m1 canonical symptoms related to the Pauli X error, the X-type error result information indicating qubits of the quantum circuit on which the Pauli X error occurs; and
    determining Z-type error result information based on the fixed representative element related to the Pauli Z error and the m2 canonical symptoms related to the Pauli Z error, the Z-type error result information indicating qubits of the quantum circuit on which the Pauli Z error occurs.
  9. The method according to any one of claims 2 to 7, wherein the feature extraction network comprises multiple cascaded feature extraction sub-networks, wherein the input data of the 1st feature extraction sub-network comprises the error symptom information, the input data of the s-th feature extraction sub-network comprises the output data of the (s-1)-th feature extraction sub-network, the output data of the last feature extraction sub-network comprises the feature information, and s is an integer greater than 1;
    for a target feature extraction sub-network among the multiple cascaded feature extraction sub-networks, the input data of the target feature extraction sub-network is divided into multiple input data blocks of the same scale;
    the target feature extraction sub-network is configured to perform multiple local feature extraction mappings on the multiple input data blocks to obtain multiple groups of mapped output data, wherein each local feature extraction mapping maps the region at the same position in the multiple input data blocks to obtain one group of mapped output data, and different local feature extraction mappings map regions at different positions in the multiple input data blocks to obtain the multiple groups of mapped output data; and
    the target feature extraction sub-network is further configured to obtain the output data of the target feature extraction sub-network based on the multiple groups of mapped output data.
  10. The method according to claim 9, wherein the regions at different positions overlap.
  11. The method according to claim 9 or 10, wherein the target feature extraction sub-network comprises: at least one convolutional layer and at least one fully connected layer;
    the at least one convolutional layer is configured to perform the multiple local feature extraction mappings on the multiple input data blocks to obtain the multiple groups of mapped output data; and
    the at least one fully connected layer is configured to obtain the output data of the target feature extraction sub-network based on the multiple groups of mapped output data.
  12. The method according to any one of claims 1 to 11, wherein, while the neural network decoder is decoding error symptom information that has already been acquired, measurement and acquisition of new error symptom information is performed in parallel.
  13. The method according to any one of claims 2 to 12, wherein the training process of the neural network decoder is as follows:
    obtaining sample error symptom information and sample error result information corresponding to the sample error symptom information;
    obtaining, by the neural network decoder to be trained and based on the sample error symptom information, predicted decoding results corresponding to the n feature decoding networks respectively;
    determining loss function values corresponding to the n feature decoding networks respectively, based on the predicted decoding results corresponding to the n feature decoding networks respectively and on label decoding results corresponding to the n feature decoding networks respectively determined based on the sample error result information;
    determining a total loss function value based on the loss function values corresponding to the n feature decoding networks respectively; and
    adjusting parameters of the neural network decoder to be trained according to the total loss function value to obtain the trained neural network decoder.
  14. The method according to any one of claims 2 to 13, wherein the feature extraction network and the n feature decoding networks included in the neural network decoder are deployed on the same chip.
  15. The method according to any one of claims 1 to 14, wherein the chip on which the neural network decoder is deployed comprises multiple processors arranged in a tree structure, and any two processors that have no connection relationship can operate in parallel.
  16. A neural-network-based quantum error correction decoding apparatus, the apparatus comprising:
    a symptom acquisition module, configured to obtain error symptom information obtained by symptom measurement of a quantum circuit;
    a feature extraction module, configured to extract feature information from the error symptom information using a neural network decoder, wherein the neural network decoder comprises a feature extraction network and n feature decoding networks, n being an integer greater than 1;
    a feature decoding module, configured to decode the feature information by the neural network decoder to obtain a decoding result; and
    a result determination module, configured to determine error result information of the quantum circuit based on the decoding result.
  17. A computer device, comprising a processor and a memory, the memory storing a computer program that is loaded and executed by the processor to implement the method according to any one of claims 1 to 15.
  18. A computer-readable storage medium, storing a computer program that is loaded and executed by a processor to implement the method according to any one of claims 1 to 15.
  19. A computer program product, comprising a computer program that is loaded and executed by a processor to implement the method according to any one of claims 1 to 15.
  20. A chip, on which a neural network decoder is deployed, the neural network decoder being configured to implement the method according to any one of claims 1 to 15.
PCT/CN2023/108856 2022-11-22 2023-07-24 Quantum error correction decoding method, apparatus, device and chip based on neural network WO2024109128A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211468927.9A CN118070913A (zh) 2022-11-22 2022-11-22 Quantum error correction decoding method, apparatus, device and chip based on neural network
CN202211468927.9 2022-11-22

Publications (1)

Publication Number Publication Date
WO2024109128A1 true WO2024109128A1 (zh) 2024-05-30

Family

ID=91106344

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/108856 WO2024109128A1 (zh) 2022-11-22 2023-07-24 基于神经网络的量子纠错解码方法、装置、设备及芯片

Country Status (2)

Country Link
CN (1) CN118070913A (zh)
WO (1) WO2024109128A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111510157A (zh) * 2020-04-15 2020-08-07 腾讯科技(深圳)有限公司 基于神经网络的量子纠错解码方法、装置及芯片
CN111510158A (zh) * 2020-04-15 2020-08-07 腾讯科技(深圳)有限公司 量子电路的容错纠错解码方法、装置及芯片
CN112734043A (zh) * 2021-01-07 2021-04-30 电子科技大学 一种基于深度学习的分段容错逻辑量子电路解码方法
CN112988451A (zh) * 2021-02-07 2021-06-18 腾讯科技(深圳)有限公司 量子纠错解码系统、方法、容错量子纠错系统及芯片
US20210398621A1 (en) * 2018-11-07 2021-12-23 Kuano Ltd. A quantum circuit based system configured to model physical or chemical systems

Also Published As

Publication number Publication date
CN118070913A (zh) 2024-05-24
