WO2021168798A1 - Training method for quantum boltzmann machine, and hybrid computer - Google Patents


Info

Publication number
WO2021168798A1
Authority
WO
WIPO (PCT)
Prior art keywords
quantum
sample
layer
loss function
computer
Prior art date
Application number
PCT/CN2020/077208
Other languages
French (fr)
Chinese (zh)
Inventor
Zhang Wen (张文)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to PCT/CN2020/077208 priority Critical patent/WO2021168798A1/en
Priority to CN202080081890.7A priority patent/CN114730385A/en
Publication of WO2021168798A1 publication Critical patent/WO2021168798A1/en

Classifications

    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B82 - NANOTECHNOLOGY
    • B82Y - SPECIFIC USES OR APPLICATIONS OF NANOSTRUCTURES; MEASUREMENT OR ANALYSIS OF NANOSTRUCTURES; MANUFACTURE OR TREATMENT OF NANOSTRUCTURES
    • B82Y10/00 - Nanotechnology for information processing, storage or transmission, e.g. quantum computing or single electron logic
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N99/00 - Subject matter not provided for in other groups of this subclass

Definitions

  • This application relates to the field of quantum computing, in particular to a training method of a quantum Boltzmann machine and a hybrid computer.
  • Quantum machine learning uses the high parallelism of quantum computing to further optimize traditional machine learning.
  • The quantum Boltzmann machine is a typical quantum machine learning model.
  • However, the model structures of the quantum Boltzmann machine for supervised learning and the quantum Boltzmann machine for unsupervised learning are not uniform, so they cannot be used for semi-supervised learning.
  • This application provides a method for training a quantum Boltzmann machine and a hybrid computer, which can be used for semi-supervised learning.
  • A training method of a quantum Boltzmann machine includes the following steps: obtain the first loss function of the quantum Boltzmann machine, where the model structure of the quantum Boltzmann machine includes a first layer and a second layer; the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, or the quantum units of the first layer are used to assign the input samples of unlabeled samples; the quantum units of the first layer are fully connected with the quantum units of the second layer.
  • The first loss function = α * the second loss function + β * the third loss function, where the second loss function is obtained by calculating the negative logarithmic conditional likelihood of the conditional probability of the output sample under the condition of the input sample of the labeled sample, and the third loss function is obtained by calculating the negative logarithmic conditional likelihood of the marginal probability of the input sample of the unlabeled sample.
  • α and β are constants whose values usually need to be determined according to the characteristics of the sample data set; one example is α∈[0,1], β∈[0,1].
  • Obtain the first partial derivative of the first loss function with respect to a predetermined parameter of the quantum Boltzmann machine's Hamiltonian, where the predetermined parameter includes the connection weight of two quantum units in the quantum Boltzmann machine or the bias of a quantum unit; apply a gradient algorithm to the first partial derivative to update the predetermined parameter and obtain an updated quantum Boltzmann machine, whose Hamiltonian uses the updated predetermined parameter.
  • The model structure of the quantum Boltzmann machine includes a first layer and a second layer; the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, or the quantum units of the first layer are used to assign the input samples of unlabeled samples; the quantum units of the first layer are fully connected with the quantum units of the second layer. The model structure for supervised learning and for unsupervised learning is the same, and the total number of qubits required is the same.
  • The loss function for training the quantum Boltzmann machine combines, in a certain ratio, the negative logarithmic conditional likelihood of the conditional probability of the output sample under the condition of the input sample of the labeled sample and the negative logarithmic conditional likelihood of the marginal probability of the input sample of the unlabeled sample, so that the trained quantum Boltzmann machine can be adapted to semi-supervised learning.
  • Calculation methods for the second loss function and the third loss function are also provided. The second loss function is calculated as follows: perform the negative logarithmic conditional likelihood calculation on the conditional probability of the output sample under the condition of the labeled sample's input sample to obtain the supervised-learning loss function; then convert the supervised-learning loss function into the second loss function using the Golden-Thompson inequality.
  • The third loss function is calculated as follows: perform the negative logarithmic conditional likelihood calculation on the marginal probability of the input sample of the unlabeled sample to obtain the unsupervised-learning loss function; then convert the unsupervised-learning loss function into the third loss function using the Golden-Thompson inequality.
  • The Golden-Thompson inequality conversion is performed on both.
  • the first partial derivative is expressed as a polynomial
  • The method further includes: determining a predetermined sample from a sample data set, the predetermined sample being a labeled sample or an unlabeled sample; preparing the first quantum state of the predetermined sample; performing the quantum approximate optimization algorithm (QAOA) on the first quantum state to obtain a second quantum state; and measuring, on the second quantum state, the second partial derivative of the Hamiltonian with respect to the predetermined parameter as a term of the first partial derivative.
  • The predetermined sample determined from the sample data set can be processed directly by the digital computer, and the subsequent operations that need to be performed on quantum states can all be completed by the quantum computer.
  • the method further includes: calculating a first average value of the M second partial derivatives obtained by the predetermined sample, and using the first average value as a term of the first partial derivative.
  • M second partial derivatives can be calculated for the same predetermined sample, and the larger the value of M, the higher the calculation accuracy.
  • the method further includes: calculating a second average value of the second partial derivatives corresponding to the N samples obtained in the sample data set, and using the second average value as the term of the first partial derivative.
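The two averaging steps described above (M repeated measurements per predetermined sample, then an average over the N samples in the data set) can be sketched with a simulated noisy measurement. `measure_second_partial()` is a hypothetical stand-in for the quantum measurement, modeled here as a noisy reading of an underlying true value; all names and numbers are illustrative, not from the patent.

```python
import numpy as np

# Hypothetical per-sample true values of the second partial derivative.
rng = np.random.default_rng(1)
N = 8
true_partials = rng.normal(size=N)

def measure_second_partial(i):
    """One simulated (noisy) measurement of the second partial derivative for sample i."""
    return true_partials[i] + rng.normal(scale=0.5)

M = 200
# First average: M repeated measurements per predetermined sample.
first_averages = [np.mean([measure_second_partial(i) for _ in range(M)])
                  for i in range(N)]
# Second average: over the N samples in the data set.
second_average = float(np.mean(first_averages))
```

As the text notes, a larger M reduces the measurement noise in each per-sample estimate, and averaging over the data set then yields the term of the first partial derivative.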
  • When the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, the first layer and the second layer are both visible layers; when the quantum units of the first layer are used to assign the input samples of unlabeled samples, the first layer is the visible layer and the second layer is the hidden layer.
  • For supervised learning, the first and second layers of the quantum Boltzmann machine are both visible layers: the input and the output each serve as a visible layer, and there is no additional hidden layer.
  • For unsupervised learning, the second layer (holding the output samples) is changed from a visible layer to a hidden layer, and no additional hidden layer is introduced. This ensures the unification of the supervised-learning model and the unsupervised-learning model.
  • a hybrid computer for implementing the above-mentioned various methods.
  • the hybrid computer includes modules, units, or means corresponding to the foregoing methods, and the modules, units, or means can be implemented by hardware, software, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules or units corresponding to the above-mentioned functions; for example, a hybrid computer may include a quantum computer and a digital computer for implementing the above-mentioned method.
  • a hybrid computer including: a processor and a memory; the memory is used to store computer instructions, and when the processor executes the instructions, the hybrid computer can execute the method of any one of the foregoing aspects.
  • a hybrid computer including: a processor; the processor is configured to be coupled to a memory and, after reading instructions in the memory, to execute the method of any one of the above aspects according to the instructions.
  • a computer-readable storage medium stores instructions that, when run on a computer, enable the computer to execute the method in any of the above aspects.
  • a computer program product containing instructions which when running on a computer, enables the computer to execute the method in any of the above aspects.
  • a hybrid computer is provided; for example, the hybrid computer may be a chip or a chip system.
  • the hybrid computer includes a processor for implementing the functions involved in any of the above aspects.
  • the hybrid computer further includes a memory for storing necessary program instructions and data.
  • When the hybrid computer is a chip system, it may be composed of chips, or may include chips and other discrete devices.
  • FIG. 1 is a schematic structural diagram of a hybrid computer provided by an embodiment of this application.
  • FIG. 2 is a schematic flowchart of a training method of a quantum Boltzmann machine provided by an embodiment of the application;
  • FIG. 3 is a schematic structural diagram of a quantum Boltzmann machine provided by an embodiment of the application.
  • FIG. 4 is a schematic structural diagram of a quantum Boltzmann machine provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a hybrid computer provided by another embodiment of this application.
  • Supervised learning: the training data uses labeled samples, and each training sample has both features and a label. Usually the input samples are the existing features and the output samples are the labels. Through training, the machine finds the relationship between features and labels by itself, so that when facing data with only features and no label, the label can be judged.
  • Unsupervised learning: the training data uses unlabeled samples, usually only input samples, and the label information of the input samples is unknown. The goal is to reveal the inherent properties and laws of the data by learning from unlabeled samples, which provides a basis for further data analysis. Among such learning tasks, clustering is the most studied and widely applied; other unsupervised algorithms include density estimation, anomaly detection, and so on.
  • Semi-supervised learning: the training data contains both labeled samples and unlabeled samples. Without manual intervention, the machine does not rely on external interaction and automatically uses the unlabeled samples to improve learning performance; this is semi-supervised learning.
  • Quantum computing uses the principles of quantum mechanics to perform general-purpose calculations.
  • Classical computers are also called digital computers.
  • Quantum computing is based on the manipulation of qubits. Each qubit can be in a superposition of quantum states.
  • N qubits can be in a superposition of 2^N quantum states.
  • A Boltzmann machine is a neural network model. It contains two sets of variables, hidden variables and visible variables, and all variables are binary (taking the value 0 or 1).
  • A Boltzmann machine with N variables satisfies the following three properties: 1. All variables (samples) can be represented by a binary random vector x∈{0,1}^N; 2. All variables are fully connected, and the value of each variable depends on all the other variables; 3. The influence relationship between variables is pairwise symmetric.
  • The loss function used in Boltzmann machine parameter training is the negative log-likelihood, where v represents the visible variables and P_v represents the marginal probability of the visible variables under the model: P_v = (1/Z) Σ_h exp(-E(x)), where h represents the hidden variables and Z is the partition function.
  • The parameter update formula of the Boltzmann machine usually cannot be calculated exactly and needs to be approximated by the Gibbs sampling method.
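For a toy machine, the marginal probability P_v can be evaluated exactly by brute-force enumeration, which is what Gibbs sampling approximates at scale. A minimal sketch, assuming the standard pairwise energy function E(x) = -Σ_{i<j} w_ij x_i x_j - Σ_i b_i x_i (the text does not spell the energy function out, so this form is an assumption):

```python
import itertools
import numpy as np

# Toy Boltzmann machine with 2 visible and 1 hidden binary unit.
# Assumed energy: E(x) = -sum_{i<j} w_ij x_i x_j - sum_i b_i x_i.
rng = np.random.default_rng(0)
n = 3
W = np.triu(rng.normal(size=(n, n)), 1)   # upper-triangular pairwise weights
b = rng.normal(size=n)

def energy(x):
    x = np.asarray(x, dtype=float)
    return -(x @ W @ x) - b @ x

# Partition function Z: sum over all 2^n binary configurations.
states = list(itertools.product([0, 1], repeat=n))
Z = sum(np.exp(-energy(s)) for s in states)

def marginal_visible(v):
    """P_v = (1/Z) * sum_h exp(-E(v, h)), summing out the hidden unit h."""
    return sum(np.exp(-energy(v + (h,))) for h in (0, 1)) / Z

probs = [marginal_visible(v) for v in itertools.product([0, 1], repeat=2)]
```

The enumeration costs 2^n, which is exactly why larger models fall back on Gibbs sampling.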
  • the quantum Boltzmann machine can be regarded as a quantum version of the classic Boltzmann machine.
  • the variable in the quantum Boltzmann machine is qubit.
  • The energy function of the classical Boltzmann machine is replaced by the Hamiltonian concept from quantum mechanics.
  • The Hamiltonian is a quantum-mechanical operator that can be represented by a matrix. For a system of N qubits, the Hamiltonian is a matrix of dimension 2^N × 2^N.
  • The eigenvalues of the Hamiltonian are energies, so if the Hamiltonian has only diagonal elements, the model reduces to the classical case. The marginal probability of the visible variables is P_v = (1/Z) tr[Λ_v exp(-H)], where Z = tr[exp(-H)] is the partition function, tr() represents the trace of the matrix, Λ_v is the projector onto the visible configuration v (acting as the identity on the hidden variables), h represents a hidden variable, and I is the identity matrix.
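The quantum marginal P_v = (1/Z) tr[Λ_v exp(-H)] can likewise be checked numerically for a toy two-qubit machine. The Hamiltonian coefficients below are illustrative assumptions, not values from the patent; exp(-H) is computed via eigendecomposition:

```python
import numpy as np

# Toy 2-qubit quantum Boltzmann machine (1 visible + 1 hidden qubit).
# The Hamiltonian mixes diagonal (Pauli-Z) and off-diagonal (Pauli-X) terms.
PZ = np.diag([1.0, -1.0])
PX = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
H = 0.7 * np.kron(PZ, PZ) + 0.3 * np.kron(PX, I2) + 0.2 * np.kron(I2, PX)

# exp(-H) via eigendecomposition of the real symmetric H.
w, V = np.linalg.eigh(H)
expmH = (V * np.exp(-w)) @ V.T
Z = float(np.trace(expmH))                # partition function Z = tr[exp(-H)]

def marginal_visible(v):
    """P_v = (1/Z) tr[Lambda_v exp(-H)], with Lambda_v = |v><v| on the visible qubit, identity on the hidden one."""
    ket = np.zeros(2); ket[v] = 1.0
    Lam = np.kron(np.outer(ket, ket), I2)
    return float(np.trace(Lam @ expmH)) / Z

probs = [marginal_visible(v) for v in (0, 1)]
```

Because the projectors over all visible values sum to the identity, these marginals sum to one, mirroring the classical case.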
  • The quantum approximate optimization algorithm (QAOA) is a quantum algorithm; specifically, it is a quantum-classical hybrid algorithm that combines classical parameter optimization with quantum computing.
  • QAOA involves two operators, called the mixed Hamiltonian H M and the cost Hamiltonian H C.
  • QAOA specifically includes the following steps: first prepare a quantum state whose density operator is exp(-βH_M)/tr[exp(-βH_M)], where β is a constant; then apply to this quantum state the operator ∏_l exp(-iγ_l H_C) exp(-iν_l H_M), where ν_l and γ_l are a series of constants to be optimized and are given random initial values; then measure the average value ⟨H_C⟩ of the operator H_C, which is a numerical value; using a classical computer and a classical optimization method (such as gradient descent), optimize ν_l and γ_l until the minimum value of ⟨H_C⟩ is obtained, at which point ν_l and γ_l take their optimal values. Applying the resulting operator to the quantum state exp(-βH_M)/tr[exp(-βH_M)] then yields (approximately) the quantum state with density operator exp(-βH_C)/tr[exp(-βH_C)].
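The steps above can be sketched for a minimal single-layer case. The patent's exact Hamiltonians are not given, so this assumes the textbook choices H_M = X⊗I + I⊗X (mixer) and the diagonal cost H_C = Z⊗Z, replaces the thermal initial state with the uniform superposition |++⟩, and replaces gradient descent with a crude grid search over the two angles:

```python
import numpy as np

# Minimal p = 1 QAOA sketch on 2 qubits (illustrative assumptions throughout).
PX = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
H_M = np.kron(PX, I2) + np.kron(I2, PX)   # mixer Hamiltonian
H_C = np.diag([1.0, -1.0, -1.0, 1.0])     # diagonal cost Hamiltonian (Z x Z)

plus = np.ones(4) / 2.0                   # |++> initial state
wM, VM = np.linalg.eigh(H_M)              # for applying exp(-i*nu*H_M)

def expval_HC(nu, gamma):
    """<H_C> after applying exp(-i*nu*H_M) exp(-i*gamma*H_C) to |++>."""
    psi = np.exp(-1j * gamma * np.diag(H_C)) * plus       # H_C is diagonal
    psi = (VM * np.exp(-1j * nu * wM)) @ (VM.conj().T @ psi)
    return float(np.real(psi.conj() @ H_C @ psi))

# Grid search standing in for the classical optimizer of nu and gamma.
grid = np.linspace(0.0, np.pi, 25)
best_val, best_nu, best_gamma = min(
    (expval_HC(nu, g), nu, g) for nu in grid for g in grid)
```

For this two-qubit cost, a single QAOA layer already reaches the minimum eigenvalue of H_C (here -1) at suitable angles, which the grid search finds.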
  • At least one refers to one or more, and “multiple” refers to two or more.
  • And/or describes the association relationship of the associated objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
  • "At least one of the following items" or similar expressions refers to any combination of these items, including any combination of a single item or plural items.
  • "At least one of a, b, or c" can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be single or multiple.
  • The embodiments of the present application use words such as "first" and "second" to distinguish objects with similar names or functions. Those skilled in the art can understand that the words "first", "second" and the like do not limit the number or the execution order.
  • an embodiment of the present application provides a hybrid computer 01, which includes a computing subsystem 20 and a digital computer 10 coupled to the computing subsystem 20.
  • the computing subsystem 20 can provide professional functions.
  • the computing subsystem 20 is a quantum computer
  • the digital computer 10 is a classical computer.
  • the quantum computer is a quantum annealing and/or adiabatic quantum computer.
  • the quantum computer is a gate-model quantum computer or another suitable type of quantum computer.
  • The digital computer 10 includes one or more digital processors 101, a communication line 102, and at least one communication interface (FIG. 1 shows, as an example only, one communication interface 104 and one digital processor 101).
  • the memory 103 may also be included.
  • The digital processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of this application.
  • the communication line 102 may include a path for connecting different components.
  • the communication interface 104 may be a transceiver module for communicating with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), etc.
  • The transceiver module may be a device such as a transceiver.
  • the communication interface 104 may also be a transceiver circuit located in the digital processor 101 to implement signal input and signal output of the processor.
  • the memory 103 may be a device having a storage function.
  • For example, it can be read-only memory (ROM) or another type of static storage device that can store static information and instructions, random access memory (RAM) or another type of dynamic storage device that can store information and instructions, electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory can exist independently and is connected to the processor through the communication line 102.
  • the memory can also be integrated with the digital processor.
  • the memory 103 is used to store computer-executable instructions for executing the solution of the present application, and the digital processor 101 controls the execution.
  • the digital processor 101 is configured to execute computer execution instructions stored in the memory 103, so as to implement other classical digital processing calculations besides quantum calculation in the training method provided in the embodiment of the present application.
  • the communication interface 104 is responsible for communicating with other devices, which is not specifically limited in the embodiment of the present application.
  • the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
  • the digital processor 101 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 1.
  • The digital computer 10 may include multiple digital processors, such as the digital processor 101 and the digital processor 108 in FIG. 1. Each of these digital processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the digital processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • the digital computer 10 may further include an output device 105 and an input device 106.
  • the output device 105 communicates with the digital processor 101 and can display information in a variety of ways.
  • the output device 105 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector (projector), etc.
  • the input device 106 communicates with the digital processor 101 and can receive user input in a variety of ways.
  • the input device 106 may be a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the above-mentioned digital computer 10 may be a general-purpose device or a special-purpose device.
  • Those skilled in the relevant art will understand that, when properly configured or programmed to form a dedicated device, and/or when communicatively coupled to control a quantum computer, other digital computer configurations can be used to practice the system and method of the present invention, including handheld devices, multi-processor systems, microprocessor-based or programmable consumer electronic devices, personal computers ("PCs"), network PCs, minicomputers, mainframe computers, and the like.
  • the digital computer 10 will sometimes be referred to in the singular form, but this is not intended to limit the application to a single digital computer.
  • the system and method of the present invention can also be practiced in a distributed computing environment, where tasks or sets of instructions are performed or executed by remote processing devices linked through a communication network.
  • Computer-readable or processor-readable instructions (sometimes referred to as program modules), application programs, and/or data can be located in both local memory storage devices and remote memory storage devices (for example, non-transitory computer-readable or processor-readable media).
  • the digital computer 10 is coupled to the computing subsystem 20 through the controller 109, and the controller 109 is coupled to the communication line 102 in the digital computer 10.
  • the memory 103 may store a set of computer-readable or processor-readable computing instructions (ie, computing modules) to perform pre-processing, co-processing, and post-processing on the computing subsystem 20.
  • The memory 103 can store a set of analog-computer or quantum-computer interface modules operable to interact with the computing subsystem 20.
  • the memory 103 may store related instructions for training of the quantum Boltzmann machine to provide programs and parameters for the operation of the computing subsystem 20 as the quantum Boltzmann machine.
  • the training method of the quantum Boltzmann machine provided by the embodiment of the present application can be implemented on the digital computer 10 and the computing subsystem 20.
  • the computing subsystem 20 may be set in an isolated environment (not shown).
  • the environment can shield the internal components of the quantum computer from heat, magnetic fields, and the like.
  • the computing subsystem 20 may include a quantum processor 201.
  • the quantum processor 201 includes programmable elements such as qubits, couplers, and other devices.
  • the qubits are read through the read control system 202. These results are fed to the memory 103 of the digital computer 10.
  • the qubit is controlled via the qubit control system 203.
  • the coupler is controlled via the coupler control system 204.
  • the qubit control system 203 and the coupler control system 204 are used to implement quantum annealing as described herein on the quantum processor 201.
  • the quantum processor may be designed to perform gate-level model quantum calculations. Alternatively or additionally, the quantum processor may be designed to perform quantum annealing and/or adiabatic quantum calculations.
  • the embodiment of the present application provides a method for training a quantum Boltzmann machine, as shown in FIG. 2, including the following steps:
  • the model structure of the quantum Boltzmann machine includes a first layer and a second layer; the quantum unit of the first layer is used to assign the input sample of the marked sample, and the quantum unit of the second layer is used to assign the value of the marked sample The output sample; or the quantum unit of the first layer is used to assign the input sample of the unlabeled sample.
  • the quantum unit of the first layer is fully connected with the quantum unit of the second layer.
  • When the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, the first layer and the second layer are both visible layers.
  • When the quantum units of the first layer are used to assign the input samples of unlabeled samples, the first layer is the visible layer and the second layer is the hidden layer.
  • For supervised learning, the first and second layers of the quantum Boltzmann machine are both visible layers: the input and the output are both on visible layers, and there is no additional hidden layer.
  • For unsupervised learning, the output variables are moved from the visible layer to the hidden layer, and no additional hidden layer is introduced.
  • the model structure when performing supervised learning and unsupervised learning is the same, and the total number of qubits required is the same.
  • The form of the Hamiltonian of the quantum Boltzmann machine is not limited. As explained in the background introduction, it can have only diagonal elements, or it can also have off-diagonal elements.
  • The first loss function = α * the second loss function + β * the third loss function, where the second loss function is obtained by calculating the negative logarithmic conditional likelihood of the conditional probability of the output sample under the condition of the input sample of the labeled sample, and the third loss function is obtained by calculating the negative logarithmic conditional likelihood of the marginal probability of the input sample of the unlabeled sample. Exemplarily, for convenience of calculation, the second loss function is obtained in the following manner: perform the negative logarithmic conditional likelihood calculation on the conditional probability of the output sample under the condition of the labeled sample's input sample to obtain the supervised-learning loss function; then convert the supervised-learning loss function into the second loss function using the Golden-Thompson inequality.
  • The third loss function is obtained in the following way: perform the negative logarithmic conditional likelihood calculation on the marginal probability of the input sample of the unlabeled sample to obtain the unsupervised-learning loss function; then convert the unsupervised-learning loss function into the third loss function using the Golden-Thompson inequality.
  • the method for obtaining the first loss function is described as follows:
  • the specific form of the Hamiltonian H of the quantum Boltzmann machine is not limited.
  • Labeled samples include an input sample x and an output sample y.
  • The marginal probability of the input sample x in the quantum Boltzmann machine model is P(x) = (1/Z) tr[Λ_x exp(-H)], the joint probability of the input sample x and the output sample y is P(x, y) = (1/Z) tr[Λ_{x,y} exp(-H)], and the conditional probability of the output sample y under the condition that the input sample is x is P(y|x) = P(x, y)/P(x). For a quantum Boltzmann machine with a total of N qubits, the Hamiltonian H is a 2^N × 2^N matrix.
  • The unsupervised loss function is the negative log-likelihood L_unsup = -Σ_{x∈D_unlab} P_data(x) log P(x), where D_unlab represents the data set of unlabeled samples and P_data(x) represents the probability of x in the data set of unlabeled samples.
  • The overall loss function of semi-supervised learning (that is, the first loss function) is obtained by adding the above two loss functions in a certain proportion, namely L = α·L_sup + β·L_unsup, where α and β are unrestricted constants.
  • When α is 0 and β is not 0, it serves as the loss function of unsupervised learning.
  • When α is not 0 and β is 0, it serves as the loss function of supervised learning.
  • When both α and β are not 0, it serves as the loss function of semi-supervised learning.
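The proportional combination above can be sketched as follows. The probabilities fed in are placeholder arrays, not outputs of an actual quantum Boltzmann machine:

```python
import numpy as np

# Sketch of the first loss function L = alpha * L_sup + beta * L_unsup.
def supervised_loss(cond_probs):
    """Negative log conditional likelihood over P(y|x) of the labeled pairs."""
    return -float(np.sum(np.log(cond_probs)))

def unsupervised_loss(marg_probs):
    """Negative log likelihood over P(x) of the unlabeled samples."""
    return -float(np.sum(np.log(marg_probs)))

def first_loss(cond_probs, marg_probs, alpha, beta):
    # alpha = 0: unsupervised only; beta = 0: supervised only;
    # both nonzero: semi-supervised.
    return alpha * supervised_loss(cond_probs) + beta * unsupervised_loss(marg_probs)

L_semi = first_loss([0.9, 0.8], [0.7, 0.6], alpha=0.7, beta=0.3)
```

Setting one of the coefficients to zero recovers the purely supervised or purely unsupervised loss, matching the three cases enumerated above.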
  • Let θ represent any parameter in the Hamiltonian of the quantum Boltzmann machine (e.g., w_ij or b_i); the first partial derivative of the first loss function with respect to θ is then expressed as a polynomial.
  • The polynomial includes the following four terms, which are calculated respectively by a hybrid computer consisting of a quantum computer and a classical computer. Specifically, a method for calculating each term of the first partial derivative is provided:
  • S01. Determine a predetermined sample from a sample data set, where the predetermined sample includes a labeled sample or an unlabeled sample.
  • In a possible design, the method also includes S05: calculating the first average value of the M second partial derivatives obtained from the predetermined sample, and using the first average value as the term of the first partial derivative. The larger the value of M, the higher the calculation accuracy.
  • In a possible design, the method also includes S06: calculating the second average value of the second partial derivatives corresponding to the N samples obtained from the sample data set, and using the second average value as the term of the first partial derivative.
  • Step S01 can be calculated by a digital computer, and steps S02-S04 can be calculated by a quantum computer. In step S05, the first average value of the M second partial derivatives obtained for a predetermined sample can be calculated by a digital computer; in step S05, steps S02-S04 need to be repeated for each measurement of the predetermined sample.
  • In step S06, the second average value of the second partial derivatives corresponding to the N samples obtained from the sample data set can be calculated by a digital computer; in step S06, steps S02-S04 (or steps S02-S05) need to be repeated for each sample.
  • The quantum state of the qubits corresponding to the sample y_i is prepared according to the density matrix exp(-βH_M)/tr[exp(-βH_M)], and the quantum state of the qubits corresponding to the sample x_i is prepared as the quantum state |x_i⟩.
  • The mixed Hamiltonian H_M is a matrix of dimension 2^N × 2^N, where N is the total number of qubits for the input sample x and the output sample y (which is also the total number of qubits required to execute the QAOA algorithm) and n is the number of qubits for the output sample y. The remaining steps of the QAOA algorithm are then followed to complete its execution. After QAOA is completed, a quantum state is obtained, and measuring the corresponding operator on this quantum state gives the desired term.
  • The quantum state of the qubits corresponding to the sample x_i is prepared as the quantum state |x_i⟩, and the quantum state of the qubits corresponding to the sample y is prepared according to the density matrix exp(-βH_M)/tr[exp(-βH_M)].
  • step S2 repeat step S1, get the results of multiple measurements, and then calculate
  • the QAOA algorithm can be executed.
  • the sample When the QAOA algorithm is executed, the sample must first be prepared to a quantum state related to the mixed Hamiltonian. But calculate In, the quantum state of sample x is prepared in advance as
  • the acquisition process can be completely implemented on a digital computer.
  • In step 103, the gradient descent method or the gradient ascent method is applied to the first partial derivative to update the predetermined parameters, thereby completing the model training.
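The parameter update of step 103 can be sketched as plain gradient descent. The gradient function below is a toy quadratic stand-in, not the quantum-measured first partial derivative of the patent.

```python
# Minimal sketch of step 103: gradient descent on the predetermined
# parameters (connection weights and biases) using the first partial
# derivatives supplied by a gradient function.

def gradient_descent(params, grad_fn, lr=0.1, steps=200):
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Toy loss L = sum((p - 1)^2); its gradient is 2 * (p - 1).
grad_fn = lambda params: [2.0 * (p - 1.0) for p in params]
trained = gradient_descent([5.0, -3.0], grad_fn)
# Each parameter converges to the minimizer 1.0.
```

For gradient ascent (maximizing instead of minimizing), the sign of the update is simply flipped.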
  • The model structure of the quantum Boltzmann machine includes a first layer and a second layer. The quantum units of the first layer are used to assign the input samples of labeled samples, and the quantum units of the second layer are used to assign the output samples of labeled samples; alternatively, the quantum units of the first layer are used to assign the input samples of unlabeled samples. The quantum units of the first layer are fully connected with the quantum units of the second layer. The model structures for supervised learning and unsupervised learning are the same, and the total number of qubits required is the same.
  • The loss function for training the quantum Boltzmann machine is obtained by adding, in a certain ratio, the negative logarithmic conditional likelihood of the conditional probability of the output sample given the input sample of the labeled sample and the negative logarithmic likelihood of the marginal probability of the input sample of the unlabeled sample, so that the quantum Boltzmann machine obtained by training is suited to semi-supervised learning.
  • the methods and/or steps implemented by the hybrid computer can also be implemented by components (such as chips or circuits) that can be used in the hybrid computer.
  • an embodiment of the present application also provides a hybrid computer, which is used to implement the above-mentioned various methods.
  • the hybrid computer includes hardware structures and/or software modules corresponding to each function.
  • The present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer-software-driven hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
  • the embodiments of the present application may divide the functional modules of the hybrid computer according to the foregoing method embodiments.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 5 shows a schematic diagram of the structure of a hybrid computer 5.
  • the hybrid computer includes: a digital computing unit 51 and a quantum computing unit 52.
  • The digital computing unit 51 is further configured to perform negative logarithmic conditional likelihood calculation on the conditional probability of the output sample given the input sample of the labeled sample, to obtain the loss function of supervised learning;
  • the loss function of supervised learning is converted into the second loss function using the Golden-Thompson inequality.
  • The digital computing unit 51 is further configured to perform negative logarithmic likelihood calculation on the marginal probability of the input sample of the unlabeled sample, to obtain the loss function of unsupervised learning;
  • the loss function of unsupervised learning is converted into the third loss function using the Golden-Thompson inequality.
  • the first partial derivative is expressed as a polynomial
  • The digital computing unit 51 is configured to determine a predetermined sample from a sample data set, where the predetermined sample includes the labeled sample or the unlabeled sample.
  • The quantum computing unit 52 is configured to: prepare the first quantum state of the predetermined sample determined by the digital computing unit; execute the QAOA algorithm on the first quantum state to obtain the second quantum state; and measure, for the second quantum state, the second partial derivative of the Hamiltonian with respect to a predetermined parameter as a term of the first partial derivative.
  • The digital computing unit 51 is further configured to calculate a first average value of the M second partial derivatives obtained for the predetermined sample, and use the first average value as a term of the first partial derivative.
  • The digital computing unit 51 is further configured to calculate a second average value of the second partial derivatives corresponding to the N samples obtained from the sample data set, and use the second average value as a term of the first partial derivative.
  • When the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, both the first layer and the second layer are visible layers; or, when the quantum units of the first layer are used to assign the input samples of unlabeled samples, the first layer is the visible layer and the second layer is the hidden layer.
  • the hybrid computer is presented in the form of dividing various functional modules in an integrated manner.
  • The "module" here may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the above-mentioned functions.
  • the hybrid computer can take the form of the hybrid computer shown in FIG. 1.
  • The digital processor 101 and the computing subsystem 20 in the hybrid computer 01 shown in FIG. 1 can call the computer execution instructions stored in the memory 103, so that the hybrid computer 01 executes the method in the foregoing method embodiment; the computing subsystem 20 may be a quantum computer.
  • The function/implementation process of the digital computing unit 51 in FIG. 5 can be realized by the digital computer in the hybrid computer 01 shown in FIG. 1.
  • The function/implementation process of the quantum computing unit 52 can be implemented by the quantum computer in the hybrid computer 01 shown in FIG. 1. Since the hybrid computer 01 provided in this embodiment can execute the above-mentioned method, for the technical effects that can be obtained, refer to the above-mentioned method embodiment; details are not repeated here.
  • the embodiments of the present application also provide a hybrid computer (for example, the hybrid computer may be a chip or a chip system), the hybrid computer includes a processor and an interface, and the processor is used to read instructions to perform any of the above methods.
  • the hybrid computer also includes memory.
  • the memory is used to store necessary program instructions and data, and the processor can call the program code stored in the memory to instruct the hybrid computer to execute the method in any of the foregoing method embodiments.
  • The memory may alternatively be located outside the hybrid computer.
  • When the hybrid computer is a chip system, it may be composed of chips, or may include chips and other discrete devices; this is not specifically limited in the embodiments of the present application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center.
  • The computer-readable storage medium may be any usable medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more usable media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the computer may include the aforementioned device.

Abstract

Provided are a training method for a quantum Boltzmann machine, and a hybrid computer, relating to the field of quantum computing. The method can be applied to semi-supervised learning, and comprises: acquiring a first loss function of a quantum Boltzmann machine; acquiring a first partial derivative of the first loss function with respect to a predetermined parameter of the Hamiltonian of the quantum Boltzmann machine, wherein the predetermined parameter comprises the connection weight of two quantum units in the quantum Boltzmann machine or the bias of the quantum units; and executing a gradient algorithm on the first partial derivative to update the predetermined parameter, and acquiring an updated quantum Boltzmann machine, wherein the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.

Description

A Training Method for a Quantum Boltzmann Machine, and a Hybrid Computer

Technical Field

This application relates to the field of quantum computing, and in particular, to a training method for a quantum Boltzmann machine and a hybrid computer.
Background

Quantum machine learning uses the high parallelism of quantum computing to further optimize classical machine learning. The quantum Boltzmann machine is a typical quantum machine learning model. At present, the model structures of the supervised-learning quantum Boltzmann machine and the unsupervised-learning quantum Boltzmann machine are not unified, so they cannot be used for semi-supervised learning.
Summary

This application provides a training method for a quantum Boltzmann machine, and a hybrid computer, which can be used for semi-supervised learning.

To achieve the foregoing objective, the following technical solutions are adopted in the embodiments of this application:
In a first aspect, a training method for a quantum Boltzmann machine is provided, including the following steps. Obtain a first loss function of the quantum Boltzmann machine, where the model structure of the quantum Boltzmann machine includes a first layer and a second layer; the quantum units of the first layer are used to assign the input samples of labeled samples, and the quantum units of the second layer are used to assign the output samples of labeled samples; or the quantum units of the first layer are used to assign the input samples of unlabeled samples; the quantum units of the first layer are fully connected with the quantum units of the second layer. The first loss function = α * second loss function + β * third loss function, where the second loss function is obtained by negative logarithmic conditional likelihood calculation of the conditional probability of the output sample given the input sample of the labeled sample, and the third loss function is obtained by negative logarithmic likelihood calculation of the marginal probability of the input sample of the unlabeled sample; α and β are constants whose values usually need to be determined according to the characteristics of the sample data set, one example being α∈[0,1], β∈[0,1]. Obtain a first partial derivative of the first loss function with respect to a predetermined parameter of the Hamiltonian of the quantum Boltzmann machine, where the predetermined parameter includes the connection weight of two quantum units in the quantum Boltzmann machine or the bias of a quantum unit. Execute a gradient algorithm on the first partial derivative to update the predetermined parameter, and obtain an updated quantum Boltzmann machine, where the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.

In the above solution, the model structure of the quantum Boltzmann machine includes the first layer and the second layer, where the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, or the first layer is used to assign the input samples of unlabeled samples; the quantum units of the first layer are fully connected with the quantum units of the second layer. The model structures for supervised learning and unsupervised learning are therefore the same, and the total number of qubits required is the same. In addition, the loss function for training the quantum Boltzmann machine is obtained by adding, in a certain ratio, the negative logarithmic conditional likelihood of the conditional probability of the output sample given the input sample of the labeled sample and the negative logarithmic likelihood of the marginal probability of the input sample of the unlabeled sample, so that the quantum Boltzmann machine obtained by training is suited to semi-supervised learning.
In a possible design, calculation methods for the second loss function and the third loss function are also provided. The second loss function is calculated as follows: perform negative logarithmic conditional likelihood calculation on the conditional probability of the output sample given the input sample of the labeled sample to obtain the loss function of supervised learning; then convert the loss function of supervised learning into the second loss function using the Golden-Thompson inequality. The third loss function is calculated as follows: perform negative logarithmic likelihood calculation on the marginal probability of the input sample of the unlabeled sample to obtain the loss function of unsupervised learning; then convert the loss function of unsupervised learning into the third loss function using the Golden-Thompson inequality. Because the forms of the supervised and unsupervised loss functions obtained directly from the likelihood calculation would increase the computational complexity of subsequent processing, both are converted here using the Golden-Thompson inequality.
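The Golden-Thompson inequality invoked above states that tr(exp(A+B)) <= tr(exp(A)exp(B)) for Hermitian matrices A and B, which is what turns the exact likelihoods into tractable upper bounds. A small numerical check (an illustration, not part of the patent) using random Hermitian matrices:

```python
# Numerical check of the Golden-Thompson inequality:
# tr(exp(A + B)) <= tr(exp(A) exp(B)) for Hermitian A, B.
import numpy as np

def expm_h(H):
    # Matrix exponential of a Hermitian matrix via eigendecomposition.
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

rng = np.random.default_rng(7)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

A, B = random_hermitian(4), random_hermitian(4)
lhs = np.trace(expm_h(A + B)).real
rhs = np.trace(expm_h(A) @ expm_h(B)).real
# Golden-Thompson guarantees lhs <= rhs (with equality iff A and B commute).
```

Equality holds when A and B commute, which is why the bound is tight for a purely classical (diagonal) Hamiltonian.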
In a possible design, the first partial derivative is expressed as a polynomial, and the method further includes: determining a predetermined sample from a sample data set, where the predetermined sample includes the labeled sample or the unlabeled sample; preparing a first quantum state of the predetermined sample; executing the quantum approximate optimization (QAOA) algorithm on the first quantum state to obtain a second quantum state; and measuring, for the second quantum state, a second partial derivative of the Hamiltonian with respect to the predetermined parameter as a term of the first partial derivative. In the above process, determining the predetermined sample from the sample data set can be processed directly by a digital computer, and the subsequent processing in the quantum state can be completed entirely by a quantum computer.
In a possible design, the method further includes: calculating a first average value of the M second partial derivatives obtained for the predetermined sample, and using the first average value as a term of the first partial derivative. To improve calculation accuracy, the second partial derivative can be measured M times for the same predetermined sample; the larger the value of M, the higher the calculation accuracy.
In a possible design, the method further includes: calculating a second average value of the second partial derivatives corresponding to the N samples obtained from the sample data set, and using the second average value as a term of the first partial derivative.
In a possible design, when the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples, both the first layer and the second layer are visible layers; or, when the quantum units of the first layer are used to assign the input samples of unlabeled samples, the first layer is a visible layer and the second layer is a hidden layer. In this way, for supervised learning on labeled samples, the first and second layers of the quantum Boltzmann machine are both visible layers: input and output together form the visible layers, with no additional hidden layer. For unsupervised learning on unlabeled samples, on the basis of the same model, the second layer (for the output sample) is changed from a visible layer to a hidden layer, and no additional hidden layer is introduced. This ensures the unification of the supervised-learning model and the unsupervised-learning model.
In a second aspect, a hybrid computer is provided for implementing the above methods. The hybrid computer includes modules, units, or means corresponding to the foregoing methods, which can be implemented by hardware, by software, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above functions; for example, the hybrid computer may include a quantum computer and a digital computer for implementing the above methods.

In a third aspect, a hybrid computer is provided, including a processor and a memory. The memory is used to store computer instructions, and when the processor executes the instructions, the hybrid computer performs the method of any one of the foregoing aspects.

In a fourth aspect, a hybrid computer is provided, including a processor. The processor is configured to be coupled to a memory and, after reading instructions in the memory, execute the method of any one of the foregoing aspects according to the instructions.

In a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, enable the computer to execute the method of any one of the foregoing aspects.

In a sixth aspect, a computer program product containing instructions is provided which, when run on a computer, enables the computer to execute the method of any one of the foregoing aspects.

In a seventh aspect, a hybrid computer is provided (for example, the hybrid computer may be a chip or a chip system). The hybrid computer includes a processor for implementing the functions involved in any one of the foregoing aspects. In a possible design, the hybrid computer further includes a memory for storing necessary program instructions and data. When the hybrid computer is a chip system, it may be composed of chips, or may include chips and other discrete devices.

For the technical effects brought by any design of the second aspect to the seventh aspect, refer to the technical effects brought by the corresponding designs of the first aspect; details are not repeated here.
Description of the Drawings

FIG. 1 is a schematic structural diagram of a hybrid computer according to an embodiment of this application;

FIG. 2 is a schematic flowchart of a training method for a quantum Boltzmann machine according to an embodiment of this application;

FIG. 3 is a schematic structural diagram of a quantum Boltzmann machine according to an embodiment of this application;

FIG. 4 is a schematic structural diagram of a quantum Boltzmann machine according to an embodiment of this application;

FIG. 5 is a schematic structural diagram of a hybrid computer according to another embodiment of this application.
Detailed Description

First, the technical terms used in the embodiments of this application are described as follows.

Supervised learning: the training data consists of labeled samples, having both features and labels; usually the input samples are the features and the output samples are the labels. Through training, the machine learns the relationship between features and labels, so that when it encounters data with only features and no labels, it can predict the labels.

Unsupervised learning: the training data consists of unlabeled samples, usually only input samples whose label information is unknown. The goal is to reveal the inherent properties and laws of the data by learning from the unlabeled samples, providing a basis for further data analysis. Among such learning tasks, clustering is the most studied and most widely applied; other unsupervised algorithms include density estimation and anomaly detection.

Semi-supervised learning: the training data contains both labeled samples and unlabeled samples. Without manual intervention, the machine automatically uses the unlabeled samples to improve learning performance without relying on external interaction; this is semi-supervised learning.
Quantum computing: quantum computing uses the principles of quantum mechanics to perform general-purpose computation. A classical computer (digital computer) encodes, stores, and processes data in binary using 0 and 1, and each bit takes the value 0 or 1. Quantum computing is based on the manipulation of quantum bits (qubits), and each qubit can be in a superposition of the quantum states |0> and |1>. N qubits can be in a superposition of 2^N quantum states (the states |0...0>, |0...1>, ..., |1...0>, |1...1>), for example (1/√(2^N))(|0...0> + |0...1> + ... + |1...0> + |1...1>). An operation on a superposition state is equivalent to that operation acting on all 2^N states simultaneously, which gives the quantum computer its powerful quantum parallel computing capability.
Boltzmann machine: a Boltzmann machine is a neural network model. It contains two groups of variables, hidden variables and visible variables, and all variables are binary (taking 0 or 1). A Boltzmann machine with N variables satisfies the following three properties: 1. all variables (samples) can be represented by a binary random vector x ∈ {0,1}^N; 2. all variables are fully connected, and the value of each variable depends on all the other variables; 3. the influence between variables is pairwise symmetric. The joint probability of the variables x follows the Boltzmann distribution P(x) = (1/Z)exp(-E(x)), where Z is the partition function Z = Σ_x exp(-E(x)) and the energy function is E(x) = -(Σ_{i<j} w_ij x_i x_j + Σ_i b_i x_i), where w_ij is the connection weight between the two variables x_i and x_j, x_i ∈ {0,1} represents the state of a variable, and b_i is the bias of the variable x_i. The loss function used for Boltzmann machine parameter training is the negative log-likelihood L = -Σ_v P_v^data log P_v, where v denotes the visible variables, P_v^data denotes the actual probability of v in the training data set, and P_v is the marginal probability of the visible variables in the model, P_v = (1/Z)Σ_h exp(-E(x)), where h denotes the hidden variables. The parameter update formula of a Boltzmann machine usually cannot be computed exactly and needs to be approximated by Gibbs sampling.
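The classical model above can be made concrete by exact enumeration, which is feasible only for small N (real training resorts to Gibbs sampling). This sketch follows the definitions above directly:

```python
# Sketch of the classical Boltzmann machine: binary states x in {0,1}^N,
# energy E(x) = -(sum_{i<j} w_ij x_i x_j + sum_i b_i x_i), and
# P(x) = exp(-E(x)) / Z computed by enumerating all 2^N states.
import itertools
import math

def energy(x, w, b):
    n = len(x)
    pair = sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return -(pair + sum(b[i] * x[i] for i in range(n)))

def boltzmann_distribution(w, b):
    n = len(b)
    states = list(itertools.product([0, 1], repeat=n))
    weights = [math.exp(-energy(x, w, b)) for x in states]
    Z = sum(weights)            # partition function
    return {x: wt / Z for x, wt in zip(states, weights)}

# Illustrative parameters for N = 3 variables (upper-triangular weights).
w = [[0.0, 0.5, 0.0], [0.0, 0.0, -0.3], [0.0, 0.0, 0.0]]
b = [0.1, -0.2, 0.4]
P = boltzmann_distribution(w, b)
# P is a normalized distribution over all 2^3 = 8 binary states.
```

Marginalizing P over the hidden variables h gives the P_v appearing in the negative log-likelihood loss.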
Quantum Boltzmann machine: a quantum Boltzmann machine can be regarded as the quantum version of the classical Boltzmann machine. The variables in a quantum Boltzmann machine are qubits. The energy function of the classical Boltzmann machine is replaced by the Hamiltonian, a concept from quantum mechanics. The Hamiltonian is a quantum-mechanical operator that can be represented by a matrix; for a system of N qubits, the Hamiltonian is a matrix of dimension 2^N × 2^N. The eigenvalues of the Hamiltonian are energies. Therefore, when the Hamiltonian has only diagonal elements (for example, H = -(Σ_{i<j} w_ij σ_i^z σ_j^z + Σ_i b_i σ_i^z), where σ_i^z is also a matrix of dimension 2^N × 2^N and w_ij and b_i are model parameters), a quantum Boltzmann machine of N qubits is equivalent to a classical Boltzmann machine of N variables. When the Hamiltonian has off-diagonal elements (for example, H = -(Σ_{i<j} w_ij σ_i^z σ_j^z + Σ_i b_i σ_i^z + Σ_i Γ_i σ_i^x), where σ_i^z and σ_i^x are both matrices of dimension 2^N × 2^N and w_ij, b_i, Γ_i are model parameters), the energy function of the classical Boltzmann machine cannot describe all the characteristics of the Hamiltonian, so a quantum Boltzmann machine can describe more complex models than a classical Boltzmann machine. Similar to the classical Boltzmann machine, the joint probability of the state x of the N qubits of a quantum Boltzmann machine satisfies the Boltzmann distribution P(x) = (1/Z)exp(-<x|H|x>), where Z = tr[exp(-H)] is the partition function, |x> is the quantum state of the state x, represented as a column vector, and <x| is the conjugate transpose of |x>, represented as a row vector. The loss function used for parameter training of the quantum Boltzmann machine is also the negative log-likelihood L = -Σ_v P_v^data log P_v, with marginal probability P_v = (1/Z)tr[Λ_v exp(-H)], where tr() denotes the trace of a matrix, Λ_v = |v><v| ⊗ I, h denotes the hidden variables, and I denotes the identity matrix on the hidden variables. For the parameter update of a quantum Boltzmann machine, the Boltzmann distribution of the model can be obtained by a quantum computer and sampled, after which the parameter update values are calculated. Some studies show that adding hidden variables to a quantum Boltzmann machine has a limited effect on improving its performance, while increasing the degrees of freedom of the parameters in the Hamiltonian improves its performance more significantly. Simulations and experiments with quantum Boltzmann machines show that they can complete training faster and/or obtain better models.
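For small systems the quantum Boltzmann distribution can be computed by direct matrix algebra. The sketch below (an illustration; the transverse-field σ^x term is one common choice of off-diagonal element, and the parameter values are made up) builds a 2-qubit Hamiltonian from Pauli matrices and reads the state probabilities off the diagonal of exp(-H)/tr[exp(-H)]:

```python
# Sketch of a 2-qubit quantum Boltzmann machine: build H from Pauli
# operators, then compute the thermal density matrix exp(-H)/tr[exp(-H)];
# its diagonal gives the probabilities of the basis states 00, 01, 10, 11.
import numpy as np

I2 = np.eye(2)
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])

def op(single, site, n):
    # Embed a single-qubit operator at `site` in an n-qubit system.
    mats = [single if k == site else I2 for k in range(n)]
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

n = 2
w12, b, gamma = 0.4, [0.2, -0.1], [0.3, 0.3]   # illustrative parameters
H = -w12 * op(sz, 0, n) @ op(sz, 1, n)
for i in range(n):
    H -= b[i] * op(sz, i, n) + gamma[i] * op(sx, i, n)

evals, V = np.linalg.eigh(H)                    # H is Hermitian
rho = (V * np.exp(-evals)) @ V.conj().T         # exp(-H)
rho /= np.trace(rho)                            # normalize by Z
P = np.real(np.diag(rho))                       # P(x) for x in {00,01,10,11}
```

With gamma set to zero, H is diagonal and the distribution coincides with a classical 2-variable Boltzmann machine, matching the equivalence stated above.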
Quantum approximate optimization algorithm. The quantum approximate optimization algorithm (QAOA) is a quantum algorithm; specifically, it is a quantum-classical hybrid algorithm that combines classical parameter optimization with quantum computing. QAOA can be used to obtain the Boltzmann distribution of the quantum Boltzmann machine, which can then be sampled and used for calculation. QAOA involves two operators, called the mixing Hamiltonian H_M and the cost Hamiltonian H_C, and specifically includes the following steps: first, prepare a quantum state whose density operator is exp(-βH_M)/tr[exp(-βH_M)], where β is a constant; then apply to this quantum state the operator U(v,γ) = Π_{l=1}^{p} exp(-iv_l H_M) exp(-iγ_l H_C), where v_l and γ_l are a series of constants to be optimized and are given random initial values; then measure the average value <H_C> of the operator H_C, which is a number; using a classical computer and a classical optimization method such as gradient descent, optimize v_l and γ_l until the minimum of <H_C> is obtained, at which point v_l and γ_l take the optimal values v_l* and γ_l* respectively. Applying the operator U(v*,γ*) = Π_{l=1}^{p} exp(-iv_l* H_M) exp(-iγ_l* H_C) to the quantum state exp(-βH_M)/tr[exp(-βH_M)] will then yield the quantum state with density operator exp(-βH_C)/tr[exp(-βH_C)].
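The QAOA loop can be sketched classically for a very small system. In the following illustration (not the application's implementation: H_M and H_C are arbitrary example Hamiltonians, and a crude random search stands in for the classical optimizer such as gradient descent), a 2-qubit density matrix is prepared thermally with respect to H_M, the alternating unitaries are applied, and <H_C> is evaluated for many random parameter settings.

```python
import numpy as np

# Illustrative 2-qubit sketch of the QAOA loop (example Hamiltonians).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

H_M = np.kron(X, I2) + np.kron(I2, X)        # mixing Hamiltonian
H_C = np.kron(Z, Z) + 0.5 * np.kron(Z, I2)   # cost Hamiltonian

def expmat(Hm, t):
    """exp(t*Hm) for a Hermitian matrix Hm, via eigendecomposition."""
    w, V = np.linalg.eigh(Hm)
    return (V * np.exp(t * w)) @ V.conj().T

beta, p = 1.0, 2
rho0 = expmat(H_M, -beta)
rho0 /= np.trace(rho0)                       # exp(-beta H_M)/tr[exp(-beta H_M)]

def expect_HC(params):
    """<H_C> after applying prod_l exp(-i v_l H_M) exp(-i g_l H_C)."""
    U = np.eye(4, dtype=complex)
    for l in range(p):
        U = expmat(H_M, -1j * params[l]) @ expmat(H_C, -1j * params[p + l]) @ U
    return float(np.real(np.trace(U @ rho0 @ U.conj().T @ H_C)))

rng = np.random.default_rng(0)
vals = [expect_HC(rng.uniform(0, 2 * np.pi, 2 * p)) for _ in range(300)]
print("best <H_C> found:", round(min(vals), 3))
```

In practice the random search would be replaced by gradient descent or another classical optimizer over the 2p parameters.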
In this application, "at least one" refers to one or more, and "multiple" refers to two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean that A exists alone, that both A and B exist, or that B exists alone, where A and B can be singular or plural. "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of a single item or of plural items. For example, "at least one of a, b, or c" can mean: a; b; c; a and b; a and c; b and c; or a, b and c, where each of a, b, and c can be single or multiple. In addition, the embodiments of this application use words such as "first" and "second" to distinguish between objects with similar names, functions, or effects; those skilled in the art can understand that such words do not limit the quantity or the execution order.
As shown in FIG. 1, an embodiment of this application provides a hybrid computer 01, which includes a computing subsystem 20 and a digital computer 10 coupled to the computing subsystem 20. The computing subsystem 20 can provide specialized functions. In the embodiments provided in this application, the computing subsystem 20 is a quantum computer, and the digital computer 10 is a classical computer. In some implementations, the quantum computer is a quantum annealer and/or an adiabatic quantum computer. In other implementations, the quantum computer is a gate-model quantum computer or another suitable type of quantum computer.
The digital computer 10 includes one or more digital processors 101, a communication line 102, and at least one communication interface (FIG. 1 shows, merely as an example, one communication interface 104 and one digital processor 101), and may optionally include a memory 103.
The digital processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the solution of this application.
The communication line 102 may include a path for connecting different components.
The communication interface 104 may be a transceiver module for communicating with other devices or communication networks, such as an Ethernet, a radio access network (RAN), or a wireless local area network (WLAN). For example, the transceiver module may be a device such as a transceiver. Optionally, the communication interface 104 may also be a transceiver circuit located within the digital processor 101 to implement signal input and signal output of the processor.
The memory 103 may be a device having a storage function, for example a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory may exist independently and be connected to the processor through the communication line 102, or may be integrated with the digital processor.
The memory 103 is configured to store computer-executable instructions for executing the solution of this application, and their execution is controlled by the digital processor 101. The digital processor 101 is configured to execute the computer-executable instructions stored in the memory 103, thereby implementing the classical digital processing computations other than the quantum computations in the training method provided in the embodiments of this application. The communication interface 104 is responsible for communicating with other devices, which is not specifically limited in the embodiments of this application.
Optionally, the computer-executable instructions in the embodiments of this application may also be referred to as application program code, which is not specifically limited in the embodiments of this application.
In a specific implementation, as an embodiment, the digital processor 101 may include one or more CPUs, for example CPU0 and CPU1 in FIG. 1.
In a specific implementation, as an embodiment, the digital computer 10 may include multiple digital processors, for example the digital processor 101 and the digital processor 108 in FIG. 1. Each of these digital processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A digital processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
In a specific implementation, as an embodiment, the digital computer 10 may further include an output device 105 and an input device 106. The output device 105 communicates with the digital processor 101 and can display information in a variety of ways; for example, the output device 105 may be a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 106 communicates with the digital processor 101 and can receive user input in a variety of ways; for example, the input device 106 may be a mouse, a keyboard, a touch screen device, or a sensor device. The digital computer 10 may be a general-purpose device or a special-purpose device. Those skilled in the relevant art will understand that, when properly configured or programmed to form a special-purpose device, and/or when communicatively coupled to control a quantum computer, other digital computer configurations can be used to practice the present systems and methods, including handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronic devices, personal computers ("PCs"), network PCs, minicomputers, mainframe computers, and so on.
In this document, the digital computer 10 will sometimes be referred to in the singular, but this is not intended to limit the application to a single digital computer. The present systems and methods can also be practiced in a distributed computing environment, in which tasks or sets of instructions are performed or executed by remote processing devices linked through a communication network. In a distributed computing environment, computer-readable or processor-readable instructions (sometimes referred to as program modules), application programs, and/or data can be located in both local memory storage devices and remote memory storage devices (for example, non-transitory computer-readable or processor-readable media). As shown in FIG. 1, the digital computer 10 is coupled to the computing subsystem 20 through a controller 109, and the controller 109 is coupled to the communication line 102 in the digital computer 10. In some implementations, the memory 103 may store a set of computer-readable or processor-readable computing instructions (i.e., computing modules) to perform pre-processing, co-processing, and post-processing for the computing subsystem 20. According to the present systems and methods, the memory 103 may store a set of analog-computer or quantum-computer interface modules operable to interact with the computing subsystem 20.
In some implementations, the memory 103 may store instructions related to the training of the quantum Boltzmann machine, so as to provide programs and parameters for the operation of the computing subsystem 20 acting as the quantum Boltzmann machine. For example, the training method of the quantum Boltzmann machine provided by the embodiments of this application may be implemented on the digital computer 10 and the computing subsystem 20.
The computing subsystem 20 may be set in an isolated environment (not shown). For example, where the computing subsystem 20 is a quantum computer, the environment can shield the internal components of the quantum computer from heat, magnetic fields, and the like. The computing subsystem 20 may include a quantum processor 201.
The quantum processor 201 includes programmable elements such as qubits, couplers, and other devices. The qubits are read out via a readout control system 202, and the results are fed to the memory 103 of the digital computer 10. The qubits are controlled via a qubit control system 203, and the couplers are controlled via a coupler control system 204. In some embodiments, the qubit control system 203 and the coupler control system 204 are used to implement quantum annealing, as described herein, on the quantum processor 201. According to at least some embodiments of the systems and apparatuses of this application, the quantum processor may be designed to perform gate-model quantum computation. Alternatively or additionally, the quantum processor may be designed to perform quantum annealing and/or adiabatic quantum computation.
Based on the above hybrid computer, an embodiment of this application provides a training method for a quantum Boltzmann machine, which, referring to FIG. 2, includes the following steps:
101. Obtain the first loss function of the quantum Boltzmann machine.
The model structure of the quantum Boltzmann machine includes a first layer and a second layer. The quantum units of the first layer are used to be assigned the input samples of labeled samples and the quantum units of the second layer are used to be assigned the output samples of labeled samples; alternatively, the quantum units of the first layer are used to be assigned the input samples of unlabeled samples. The quantum units of the first layer are fully connected to the quantum units of the second layer. The model structure is explained with reference to FIG. 3 and FIG. 4, where FIG. 3 shows the model structure of the quantum Boltzmann machine for supervised learning and FIG. 4 shows the model structure for unsupervised learning. As shown in FIG. 3, in the model structure for supervised learning, the quantum units of the first layer are assigned the input samples of labeled samples, the quantum units of the second layer are assigned the output samples of labeled samples, and both the first layer and the second layer are visible layers. As shown in FIG. 4, in the model structure for unsupervised learning, the quantum units of the first layer are assigned the input samples of unlabeled samples; the first layer is a visible layer and the second layer is a hidden layer. Therefore, as shown in FIG. 3, for supervised learning with labeled samples, the first layer and the second layer of the quantum Boltzmann machine are both visible layers: the input and the output together serve as the visible layers, and there is no additional hidden layer. As shown in FIG. 4, for unsupervised learning with unlabeled samples, on the basis of the preceding model, the output variables are changed from a visible layer to a hidden layer, and no additional hidden layer is introduced either. As shown in FIG. 3 and FIG. 4, the model structure is the same for supervised learning and unsupervised learning, and the total number of qubits required is the same.
The form of the Hamiltonian of the quantum Boltzmann machine is not limited. As explained in the background introduction, it may contain only diagonal elements (for example, H = -Σ_{i,j} w_ij σ_i^z σ_j^z - Σ_i b_i σ_i^z), or it may also contain off-diagonal elements (for example, H = -Σ_{i,j} w_ij σ_i^z σ_j^z - Σ_i b_i σ_i^z - Σ_i c_i σ_i^x).
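The two Hamiltonian forms can be assembled explicitly as matrices for a small system. The following sketch (random example parameters, written for this description rather than taken from the application) builds both variants and verifies that the σ^z-only form is diagonal while the σ^x terms introduce off-diagonal elements.

```python
import numpy as np
from functools import reduce

# Illustrative construction of the two example Hamiltonian forms.
N = 3
Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

def op_on(k, op):
    """Embed a single-qubit operator `op` on qubit k of an N-qubit system."""
    mats = [I2] * N
    mats[k] = op
    return reduce(np.kron, mats)

rng = np.random.default_rng(7)
w = rng.normal(size=(N, N))
b = rng.normal(size=N)
c = rng.normal(size=N)

# H = -sum_{i<j} w_ij Z_i Z_j - sum_i b_i Z_i  (diagonal elements only)
H_diag = -sum(w[i, j] * op_on(i, Z) @ op_on(j, Z)
              for i in range(N) for j in range(i + 1, N))
H_diag = H_diag - sum(b[i] * op_on(i, Z) for i in range(N))

# Adding a transverse-field term -sum_i c_i X_i gives off-diagonal elements.
H_full = H_diag - sum(c[i] * op_on(i, X) for i in range(N))

print("H_diag diagonal:", np.allclose(H_diag, np.diag(np.diag(H_diag))))
print("H_full diagonal:", np.allclose(H_full, np.diag(np.diag(H_full))))
```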
The first loss function = α * the second loss function + β * the third loss function, where the second loss function is obtained by a negative log conditional likelihood calculation of the conditional probability of the output sample given the input sample of a labeled sample, and the third loss function is obtained by a negative log-likelihood calculation of the marginal probability of the input sample of an unlabeled sample. Exemplarily, for ease of calculation, the second loss function is obtained as follows: perform the negative log conditional likelihood calculation on the conditional probability of the output sample given the input sample of the labeled samples to obtain the loss function of supervised learning, and convert the loss function of supervised learning into the second loss function by using the Golden-Thompson inequality. The third loss function is obtained as follows: perform the negative log-likelihood calculation on the marginal probability of the input samples of the unlabeled samples to obtain the loss function of unsupervised learning, and convert the loss function of unsupervised learning into the third loss function by using the Golden-Thompson inequality.
The manner of obtaining the first loss function is described below. The specific form of the Hamiltonian H of the quantum Boltzmann machine is not limited. A labeled sample contains an input sample x and an output sample y.
In the quantum Boltzmann machine model, the marginal probability of the input sample x is P_x = (1/Z)tr[Λ_x exp(-H)], and the joint probability of the input sample x and the output sample y is P_{x,y} = (1/Z)tr[Λ_x Λ_y exp(-H)], so the conditional probability of the output sample y given the input sample x is P_{y|x} = P_{x,y}/P_x = tr[Λ_x Λ_y exp(-H)]/tr[Λ_x exp(-H)]. Here, for a quantum Boltzmann machine with N qubits in total, the Hamiltonian H is a 2^N × 2^N matrix; Λ_x = |x><x| ⊗ I_y, where |> and <| are the Dirac ket and bra symbols of quantum mechanics. Assuming the input sample x and the output sample y occupy n and N-n qubits respectively, |x> and <x| denote a 2^n-dimensional column vector and row vector, I_y denotes the 2^(N-n)-dimensional identity matrix, ⊗ is the tensor product symbol, and Λ_x is likewise a 2^N × 2^N matrix; similarly, Λ_y = I_x ⊗ |y><y|, and H_x = H - lnΛ_x. For supervised learning with labeled samples, the loss function is the negative log conditional likelihood L_sup = -Σ_{(x,y)∈D_lab} P^data_{x,y} log P_{y|x}, where D_lab denotes the data set of labeled samples and P^data_{x,y} denotes the joint probability of x and y in that data set. For unsupervised learning with unlabeled samples, the loss function is the negative log-likelihood L_unsup = -Σ_{x∈D_unlab} P^data_x log P_x, where D_unlab denotes the data set of unlabeled samples and P^data_x denotes the probability of x in that data set. These two likelihood functions are inconvenient for subsequent calculation and processing, so the Golden-Thompson inequality is used to take L_lab = -Σ_{(x,y)∈D_lab} P^data_{x,y} log( tr[exp(-H_{x,y})]/tr[exp(-H_x)] ) as the loss function of supervised learning (i.e., the second loss function), and L_unlab = -Σ_{x∈D_unlab} P^data_x log( tr[exp(-H_x)]/tr[exp(-H)] ) as the loss function of unsupervised learning (i.e., the third loss function), where H_{x,y} = H - lnΛ_x - lnΛ_y. The overall loss function of semi-supervised learning (i.e., the first loss function) is obtained by adding these two loss functions in a certain proportion, that is, L = αL_lab + βL_unlab, where α and β are unrestricted constants; one example is α ∈ [0,1], β ∈ [0,1]. Usually their values are determined according to the characteristics of the sample data set: when α is 0 and β is not 0, L acts as an unsupervised loss function; when α is not 0 and β is 0, it acts as a supervised loss function; and when neither α nor β is 0, it acts as a semi-supervised loss function.
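The Golden-Thompson step can be checked numerically on a toy model. In the following sketch (a random 2-qubit example written for this description: qubit 0 carries the input x, qubit 1 the output y), the "clamped" trace tr[exp(-H_x)] is computed by compressing exp(-H) to the range of Λ_x, and is verified never to exceed tr[Λ_x exp(-H)] = Z·P_x, which is the inequality behind the surrogate losses above.

```python
import numpy as np

# Numeric illustration of the Golden-Thompson bound (example values).
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
H = (A + A.T) / 2                          # random Hermitian Hamiltonian

def exp_neg(M):
    """exp(-M) for a real symmetric matrix M."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(-w)) @ V.T

Z = np.trace(exp_neg(H))

def Lambda_x(x):
    """Projector |x><x| (tensor) I_y; qubit 0 is the input x."""
    ket = np.zeros(2)
    ket[x] = 1.0
    return np.kron(np.outer(ket, ket), np.eye(2))

def clamped_trace(P):
    """tr[exp(-(H - ln Lambda))]: exp(-H) compressed to range(Lambda)."""
    w, V = np.linalg.eigh(P)
    B = V[:, w > 0.5]                      # orthonormal basis of range(P)
    return np.trace(exp_neg(B.T @ H @ B))

probs, bounds = [], []
for x in (0, 1):
    P = Lambda_x(x)
    probs.append(np.trace(P @ exp_neg(H)) / Z)    # exact marginal P_x
    bounds.append(clamped_trace(P) / Z)           # Golden-Thompson bound
    print(f"x={x}: P_x={probs[-1]:.4f} >= bound={bounds[-1]:.4f}")
```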
102. Obtain the first partial derivative of the first loss function with respect to a predetermined parameter of the Hamiltonian of the quantum Boltzmann machine, where the predetermined parameter includes the connection weight between two quantum units in the quantum Boltzmann machine or the bias of a quantum unit.
Let θ denote any parameter in the Hamiltonian of the quantum Boltzmann machine (for example, w_ij or b_i in H = -Σ_{i,j} w_ij σ_i^z σ_j^z - Σ_i b_i σ_i^z). Then ∂/∂θ ln tr[exp(-H_{x,y})] = -tr[exp(-H_{x,y}) ∂H/∂θ]/tr[exp(-H_{x,y})] = -<∂H/∂θ>_{x,y} and ∂/∂θ ln tr[exp(-H_x)] = -tr[exp(-H_x) ∂H/∂θ]/tr[exp(-H_x)] = -<∂H/∂θ>_x.
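The derivative identity ∂/∂θ ln tr[exp(-H)] = -tr[exp(-H) ∂H/∂θ]/tr[exp(-H)] can be verified by finite differences; the following sketch (random example matrices, written for this description) compares the analytic expression against a central-difference estimate.

```python
import numpy as np

# Finite-difference check of d/dtheta ln tr[exp(-H)] (example values).
rng = np.random.default_rng(3)
A = rng.normal(size=(8, 8)); A = (A + A.T) / 2    # H at theta = 0
D = rng.normal(size=(8, 8)); D = (D + D.T) / 2    # dH/dtheta

def exp_neg(M):
    w, V = np.linalg.eigh(M)
    return (V * np.exp(-w)) @ V.T

def log_trace(theta):
    return np.log(np.trace(exp_neg(A + theta * D)))

theta, eps = 0.3, 1e-5
numeric = (log_trace(theta + eps) - log_trace(theta - eps)) / (2 * eps)

E = exp_neg(A + theta * D)
analytic = -np.trace(E @ D) / np.trace(E)
print(f"numeric={numeric:.6f}  analytic={analytic:.6f}")
```

The same identity applies with H replaced by the clamped Hamiltonians H_x and H_{x,y}, since the Λ terms do not depend on θ.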
Then ∂L/∂θ = α Σ_{(x,y)∈D_lab} P^data_{x,y} (<∂H/∂θ>_{x,y} - <∂H/∂θ>_x) + β Σ_{x∈D_unlab} P^data_x (<∂H/∂θ>_x - <∂H/∂θ>), where <∂H/∂θ>_{x,y} = tr[exp(-H_{x,y}) ∂H/∂θ]/tr[exp(-H_{x,y})], <∂H/∂θ>_x = tr[exp(-H_x) ∂H/∂θ]/tr[exp(-H_x)], and <∂H/∂θ> = tr[exp(-H) ∂H/∂θ]/tr[exp(-H)]. This can be written as a polynomial containing the following four terms: α Σ_{(x,y)∈D_lab} P^data_{x,y} <∂H/∂θ>_{x,y}; -α Σ_{(x,y)∈D_lab} P^data_{x,y} <∂H/∂θ>_x; β Σ_{x∈D_unlab} P^data_x <∂H/∂θ>_x; and -β <∂H/∂θ>. The four terms are respectively calculated by the hybrid computer consisting of the quantum computer and the classical computer. A method for calculating each term of the first partial derivative is specifically provided as follows:
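For a model small enough to diagonalize exactly, the four expectation terms and the resulting gradient can be evaluated by brute force on a classical computer. The following sketch uses toy data and random parameters (all assumed for illustration only) for a 2-qubit model in which qubit 0 carries the input x and qubit 1 the output/hidden variable y; clamping is implemented as compression to the corresponding subspace (basis order |xy>).

```python
import numpy as np

# Brute-force evaluation of the four gradient terms (toy example).
rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4)); H = (A + A.T) / 2
D = rng.normal(size=(4, 4)); D = (D + D.T) / 2    # dH/dtheta

def exp_neg(M):
    w, V = np.linalg.eigh(M)
    return (V * np.exp(-w)) @ V.T

def expect_D(cols):
    """<dH/dtheta> in the thermal state of H clamped to a subspace."""
    B = np.eye(4)[:, cols]
    E = exp_neg(B.T @ H @ B)
    return np.trace(E @ (B.T @ D @ B)) / np.trace(E)

sub_x = {0: [0, 1], 1: [2, 3]}                    # x clamped, y free
sub_xy = {(x, y): [2 * x + y] for x in (0, 1) for y in (0, 1)}

labeled = [(0, 1), (1, 1)]                        # toy labeled samples
unlabeled = [0, 1, 1]                             # toy unlabeled samples
alpha, beta = 0.5, 0.5

term_xy = np.mean([expect_D(sub_xy[s]) for s in labeled])
term_x_lab = np.mean([expect_D(sub_x[x]) for x, _ in labeled])
term_x_unlab = np.mean([expect_D(sub_x[x]) for x in unlabeled])
term_free = expect_D([0, 1, 2, 3])

grad = alpha * (term_xy - term_x_lab) + beta * (term_x_unlab - term_free)
print("dL/dtheta =", round(float(grad), 6))
```

On the hybrid computer, each `expect_D` call is replaced by the QAOA preparation and measurement procedures described below.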
S01. Determine a predetermined sample from the sample data set, where the predetermined sample is a labeled sample or an unlabeled sample.
S02. Prepare the first quantum state of the predetermined sample.
S03. Perform the quantum approximate optimization (QAOA) algorithm on the first quantum state to obtain a second quantum state.
S04. Measure, on the second quantum state, the second partial derivative of the Hamiltonian with respect to the predetermined parameter, as a term of the first partial derivative.
To improve the calculation accuracy, the method further includes S05: calculate the first average of the M second partial derivatives obtained for the predetermined sample, and use the first average as a term of the first partial derivative. The larger the value of M, the higher the calculation accuracy.
In addition, all samples in the sample data set need to be processed, so the method further includes S06: calculate the second average of the second partial derivatives corresponding to the N samples obtained from the sample data set, and use the second average as a term of the first partial derivative.
Step S01 may be performed by the digital computer, and S02-S04 may be performed by the quantum computer. In step S05, the first average of the M second partial derivatives obtained for the predetermined sample may be calculated by the digital computer, and steps S02-S04 are repeated for each acquisition of the predetermined sample. Likewise, in step S06, the second average of the second partial derivatives corresponding to the N samples obtained from the sample data set may be calculated by the digital computer, and steps S02-S04 (or steps S02-S05) are repeated for each sample.
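The division of labor in S01-S06 can be sketched classically by replacing the quantum steps S02-S04 with a stub that samples from the exact clamped thermal distribution. In the following illustration (names, toy model, and parameters are all assumptions made for this description), `measure_once` plays the role of S02-S04, the inner loop averages M shots (S05), and the outer loop averages over the samples of the data set (S06).

```python
import numpy as np

# Classical stand-in for the hybrid loop S01-S06 (toy 2-qubit model).
rng = np.random.default_rng(9)
A = rng.normal(size=(4, 4)); H = (A + A.T) / 2
D = np.diag([1.0, -1.0, -1.0, 1.0])       # example observable dH/dtheta

def measure_once(cols):
    """One simulated measurement in the clamped thermal state (S02-S04)."""
    B = np.eye(4)[:, cols]
    w, V = np.linalg.eigh(B.T @ H @ B)
    prob = np.exp(-w)
    prob /= prob.sum()                    # Boltzmann weights
    k = rng.choice(len(w), p=prob)        # draw one energy eigenstate
    v = B @ V[:, k]
    return float(v @ D @ v)

def estimate(samples, M=200):
    """S05/S06: average M shots per sample, then average over samples."""
    sub = {0: [0, 1], 1: [2, 3]}          # qubit 0 clamped to the sample
    per_sample = [np.mean([measure_once(sub[x]) for _ in range(M)])
                  for x in samples]
    return float(np.mean(per_sample))

print("estimated term:", round(estimate([0, 1, 1]), 3))
```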
The calculation of each of the four terms of ∂L/∂θ is specifically described below.
1) The process of obtaining the term involving <∂H/∂θ>_{x,y}, averaged over the labeled samples, is described as follows:
S1. Select a sample (x_i, y_i) from the data set of labeled samples, and thereby determine the form of H_{x_i,y_i} = H - lnΛ_{x_i} - lnΛ_{y_i}.
S2. Before the execution of QAOA, when the initial state is prepared, the quantum state of the qubits corresponding to the sample y_i is prepared according to the density matrix exp(-βH_M)/tr[exp(-βH_M)], while the quantum state of the qubits corresponding to the sample x_i is prepared as the quantum state |x_i> according to the sample selected in S1.
S3. Set the cost Hamiltonian in the QAOA algorithm to H_{x_i,y_i} from S1, and set the mixing Hamiltonian to H_M = Σ_{i=n+1}^{N} σ_i^x, where each σ_i^x is a matrix of dimension 2^N × 2^N (the Pauli X operator acting on the i-th qubit), N is the total number of qubits of the input sample x and the output sample y (which is also the total number of qubits required to execute the QAOA algorithm), and n is the number of qubits of the input sample x. Then complete the execution of the QAOA algorithm according to its remaining steps. After QAOA is completed, the quantum state with density operator exp(-βH_{x_i,y_i})/tr[exp(-βH_{x_i,y_i})] is obtained; measuring the operator ∂H/∂θ on this quantum state yields one value of <∂H/∂θ> for the sample (x_i, y_i).
S4. Repeat S2 and S3 to obtain the results of multiple measurements for the sample (x_i, y_i), and then calculate the average of the M measured values of <∂H/∂θ>, where M is the number of times steps S2 and S3 are repeated. The value of M is not limited; the larger M is, the higher the calculation accuracy.
S5. Repeat steps S1-S4, each time selecting a different sample from the data set of labeled samples in S1, until the average of <∂H/∂θ> has been obtained for every labeled sample; then calculate the mean of these per-sample averages over the labeled data set, (1/N_lab) Σ_i <∂H/∂θ>^(i), where N_lab is the number of labeled samples and also the number of times steps S1-S4 are repeated.
2) The process of obtaining the term involving <∂H/∂θ>_x, averaged over the unlabeled samples, is described as follows:
S1. Select a sample x_i from the unlabeled sample data set, and thereby determine the form of H_{x_i} = H - lnΛ_{x_i}, where Λ_{x_i} = |x_i><x_i| ⊗ I_y. For unsupervised learning, the output sample y represents the hidden-layer variables.
S2. Before the execution of QAOA, when the initial state is prepared, the quantum state of the qubits corresponding to the sample x_i is prepared as the quantum state |x_i> according to the sample selected in S1, and the quantum state of the qubits corresponding to y is prepared according to the density matrix exp(-βH_M)/tr[exp(-βH_M)].
S3. Set the cost Hamiltonian in the QAOA algorithm to H_{x_i} from S1 and the mixing Hamiltonian to H_M = Σ_{i=n+1}^{N} σ_i^x, and then complete the QAOA algorithm according to its remaining steps, obtaining the quantum state with density operator exp(-βH_{x_i})/tr[exp(-βH_{x_i})]. Measuring the operator ∂H/∂θ on this quantum state yields one value of <∂H/∂θ> for the sample x_i.
S4. Repeat S2 and S3 to obtain multiple measurement results for the sample x i , that is, execute the QAOA algorithm and the measurement multiple times, and from these results compute
Figure PCTCN2020077208-appb-000062
N unlab是无标记样本的数量,也是步骤S1-S4重复的次数。
S5. Repeat steps S1-S4. In S1, each time a different sample is obtained from the data set of unlabeled samples, until all unlabeled samples are obtained.
Figure PCTCN2020077208-appb-000063
Then calculated
Figure PCTCN2020077208-appb-000064
N unlab is the number of unlabeled samples and the number of repetitions of steps S1-S4.
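Steps S1-S5 amount to a two-level averaging loop: repeat the QAOA execution and measurement several times per sample, average per sample, then average over the unlabeled data set. A minimal classical sketch of this control flow is given below; the quantum part is abstracted into a caller-supplied `run_qaoa_and_measure` function, which is an assumption for illustration (in the scheme above it would dispatch state preparation, QAOA execution, and operator measurement to the quantum computer).

```python
def estimate_expectation(samples, run_qaoa_and_measure, shots):
    """Two-level average of steps S1-S5: the inner loop repeats the
    QAOA execution and measurement (S2-S3) `shots` times per sample
    and averages the results (S4); the outer loop averages over all
    N_unlab samples (S5)."""
    per_sample_means = []
    for x in samples:  # S1/S5: take each unlabeled sample once
        results = [run_qaoa_and_measure(x) for _ in range(shots)]  # S2-S3
        per_sample_means.append(sum(results) / shots)  # S4
    return sum(per_sample_means) / len(per_sample_means)  # S5

# Deterministic stub standing in for the quantum measurement, for illustration only.
value = estimate_expectation([0, 1, 2], run_qaoa_and_measure=lambda x: float(x), shots=3)
```

On real hardware the inner repetitions would use fresh state preparations, since each measurement collapses the prepared state.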
3)
Figure PCTCN2020077208-appb-000065
The acquisition process is described as follows:
S1. Execute the QAOA algorithm with the cost Hamiltonian set to H and the mixing Hamiltonian set to
Figure PCTCN2020077208-appb-000066
After the QAOA algorithm completes, measure the operator
Figure PCTCN2020077208-appb-000067
on the resulting quantum state to obtain
Figure PCTCN2020077208-appb-000068
S2. Repeat step S1 to obtain multiple measurement results, and from them compute
Figure PCTCN2020077208-appb-000069
It should be noted that, once the cost Hamiltonian and the mixing Hamiltonian of the QAOA algorithm are specified, the QAOA algorithm can be executed; when executing QAOA, the samples are first prepared into the quantum state associated with the mixing Hamiltonian. However, when computing
Figure PCTCN2020077208-appb-000070
the quantum state of sample x is prepared in advance as |x i >, rather than according to the mixing Hamiltonian of the QAOA algorithm, while the initial state of sample y is prepared following the standard QAOA procedure. When computing the fourth term of
Figure PCTCN2020077208-appb-000071
the fourth term is independent of the samples, so all qubits can be prepared into the quantum state associated with the mixing Hamiltonian at the start of execution, entirely according to the general rules of the QAOA algorithm, after which the QAOA algorithm proceeds as usual. For this reason, the acquisition process of
Figure PCTCN2020077208-appb-000072
does not elaborate on sample selection and quantum state preparation.
4)
Figure PCTCN2020077208-appb-000073
The acquisition process is as follows:
S1. For each labeled sample, compute
Figure PCTCN2020077208-appb-000074
Average the results over all labeled samples to obtain
Figure PCTCN2020077208-appb-000075
Here, the acquisition process of
Figure PCTCN2020077208-appb-000076
can be implemented entirely on a digital computer.
Finally, from the results of items 1), 2), 3), and 4) above, the digital computer computes
Figure PCTCN2020077208-appb-000077
103. Apply a gradient algorithm to the first partial derivative to update the predetermined parameter, obtaining an updated quantum Boltzmann machine, where the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.
In step 103, the gradient descent or gradient ascent method is applied to the first partial derivative to update the predetermined parameters, completing the model training.
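As a concrete sketch of this update step (assuming a plain fixed-step gradient descent; the variable names are illustrative, not from the source):

```python
def gradient_descent_step(params, grads, lr=0.01):
    """One gradient-descent update of the Hamiltonian's predetermined
    parameters (connection weights and biases): each parameter moves
    against its component of the first partial derivative."""
    return [p - lr * g for p, g in zip(params, grads)]

# Two connection weights and one bias, updated with learning rate 0.1.
params = gradient_descent_step([0.5, -0.3, 0.1], [0.2, -0.1, 0.05], lr=0.1)
```

For gradient ascent the sign of the step would be flipped; in practice the step would be iterated until the loss converges.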
In the above solution, the model structure of the quantum Boltzmann machine includes a first layer and a second layer. The quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of labeled samples; alternatively, the first layer is used to assign the input samples of unlabeled samples. The quantum units of the first layer are fully connected to the quantum units of the second layer. Because the model structure for supervised learning and unsupervised learning is identical, the total number of qubits required is the same in both cases. Moreover, the loss function for training the quantum Boltzmann machine is obtained by adding, in a certain ratio, the negative log conditional likelihood of the conditional probability of the output sample given the input sample of a labeled sample and the negative log conditional likelihood of the marginal probability of the input sample of an unlabeled sample, so that the trained quantum Boltzmann machine can accommodate semi-supervised learning.
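The weighted combination that makes the loss suitable for semi-supervised learning can be sketched as follows (a minimal illustration; the function name and the way the two pre-computed likelihood terms are passed in are assumptions for illustration, not from the source):

```python
def first_loss(second_loss, third_loss, alpha, beta):
    """First loss function = alpha * second loss function
    + beta * third loss function, where the second term comes from
    labeled samples and the third from unlabeled samples.
    beta = 0 recovers purely supervised training; alpha = 0 recovers
    purely unsupervised training."""
    return alpha * second_loss + beta * third_loss

# Mixing the two negative log-likelihood terms in a fixed ratio.
loss = first_loss(second_loss=1.2, third_loss=0.8, alpha=0.7, beta=0.3)
```

The constants α and β set the relative weight of the labeled and unlabeled data; the gradient of this combined loss is what steps 102-103 estimate and descend.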
It can be understood that, in the foregoing embodiments, the methods and/or steps implemented by the hybrid computer may also be implemented by components (for example, chips or circuits) usable in the hybrid computer.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the method flow implemented by the hybrid computer. Correspondingly, an embodiment of this application further provides a hybrid computer for implementing the foregoing methods. It can be understood that, to implement the foregoing functions, the hybrid computer includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should easily be aware that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the hybrid computer may be divided into functional modules according to the foregoing method embodiments. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or may be implemented in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is illustrative and is merely a logical function division; other division manners may be used in actual implementation.
FIG. 5 is a schematic structural diagram of a hybrid computer 5. The hybrid computer includes a digital computing unit 51 and a quantum computing unit 52.
The digital computing unit 51 is configured to obtain a first loss function of a quantum Boltzmann machine, where the model structure of the quantum Boltzmann machine includes a first layer and a second layer; the quantum units of the first layer are used to assign the input samples of labeled samples, and the quantum units of the second layer are used to assign the output samples of the labeled samples; or the quantum units of the first layer are used to assign the input samples of unlabeled samples; the quantum units of the first layer are fully connected to the quantum units of the second layer; the first loss function = α * the second loss function + β * the third loss function, where the second loss function is obtained by a negative log conditional likelihood calculation on the conditional probability of the output sample given the input sample of a labeled sample, and the third loss function is obtained by a negative log conditional likelihood calculation on the marginal probability of the input sample of an unlabeled sample; and α and β are constants. The quantum computing unit 52 is configured to obtain a first partial derivative of the first loss function, obtained by the digital computing unit 51, with respect to a predetermined parameter of the Hamiltonian of the model of the quantum Boltzmann machine, where the predetermined parameter includes a connection weight between two quantum units in the model of the quantum Boltzmann machine or a bias of a quantum unit. The digital computing unit 51 is further configured to perform a gradient algorithm on the first partial derivative obtained by the quantum computing unit 52 to update the predetermined parameter, to obtain the updated quantum Boltzmann machine, where the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.
Optionally, the digital computing unit 51 is further configured to perform a negative log conditional likelihood calculation based on the conditional probability of the output sample given the input sample of a labeled sample, to obtain a loss function for supervised learning; and to convert the loss function for supervised learning into the second loss function by using the Golden-Thompson inequality.
Optionally, the digital computing unit 51 is further configured to perform a negative log conditional likelihood calculation based on the marginal probability of the input sample of an unlabeled sample, to obtain a loss function for unsupervised learning; and to convert the loss function for unsupervised learning into the third loss function by using the Golden-Thompson inequality.
Optionally, the first partial derivative is expressed as a polynomial. The digital computing unit 51 is configured to determine a predetermined sample from a sample data set, where the predetermined sample includes the labeled sample or the unlabeled sample. The quantum computing unit 52 is configured to prepare a first quantum state of the predetermined sample determined by the digital computing unit; execute the QAOA algorithm on the first quantum state to obtain a second quantum state; and measure, on the second quantum state, a second partial derivative of the Hamiltonian with respect to the predetermined parameter as a term of the first partial derivative.
Optionally, the digital computing unit 51 is further configured to compute a first average of the M second partial derivatives obtained for the predetermined sample, and use the first average as a term of the first partial derivative.
Optionally, the digital computing unit 51 is further configured to compute a second average of the second partial derivatives corresponding to N samples obtained from a predetermined sample set, and use the second average as a term of the first partial derivative.
Optionally, when the quantum units of the first layer are used to assign the input samples of labeled samples and the quantum units of the second layer are used to assign the output samples of the labeled samples, the first layer and the second layer are visible layers; or, when the quantum units of the first layer are used to assign the input samples of unlabeled samples, the first layer is a visible layer and the second layer is a hidden layer.
All related content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding functional modules, and details are not repeated here.
In this embodiment, the hybrid computer is presented in a form in which the functional modules are divided in an integrated manner. A "module" here may refer to a specific ASIC, a circuit, a processor and memory executing one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the foregoing functions. In a simple embodiment, a person skilled in the art may figure out that the hybrid computer may take the form of the hybrid computer shown in FIG. 1.
For example, the digital processor 101 and the computing subsystem 20 in the hybrid computer 01 shown in FIG. 1 may invoke the computer-executable instructions stored in the memory 103, so that the hybrid computer 01 performs the methods in the foregoing method embodiments; the computing subsystem 20 may be a quantum computer.
Specifically, the functions/implementation processes of the digital computing unit 51 in FIG. 5 may be implemented by the digital computer in the hybrid computer 01 shown in FIG. 1, that is, by the digital processor 101 invoking the computer-executable instructions stored in the memory 103. The functions/implementation processes of the quantum computing unit 52 may be implemented by the quantum computer in the hybrid computer 01 shown in FIG. 1, that is, by the quantum computer invoking the computer-executable instructions stored in the memory 103. Because the hybrid computer 01 provided in this embodiment can perform the foregoing methods, for the technical effects that it can achieve, refer to the foregoing method embodiments; details are not repeated here.
Optionally, an embodiment of this application further provides a hybrid computer (for example, the hybrid computer may be a chip or a chip system). The hybrid computer includes a processor and an interface, and the processor is configured to read instructions to perform the method in any one of the foregoing method embodiments. In a possible design, the hybrid computer further includes a memory. The memory is configured to store necessary program instructions and data, and the processor may invoke the program code stored in the memory to instruct the hybrid computer to perform the method in any one of the foregoing method embodiments. Certainly, the memory may alternatively not be located in the computing device. When the hybrid computer is a chip system, it may consist of chips, or may include a chip and other discrete devices; this is not specifically limited in the embodiments of this application.
All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When a software program is used for implementation, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to the computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)). In the embodiments of this application, the computer may include the foregoing apparatus.
Although this application is described with reference to the embodiments, in the process of implementing the claimed application, a person skilled in the art may understand and implement other variations of the disclosed embodiments by viewing the accompanying drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "one" does not exclude a plurality. A single processor or another unit may implement several functions enumerated in the claims. Some measures are recited in mutually different dependent claims, but this does not mean that these measures cannot be combined to produce a good effect.
Although this application is described with reference to specific features and the embodiments thereof, it is clear that various modifications and combinations may be made without departing from the spirit and scope of this application. Correspondingly, the specification and the accompanying drawings are merely example descriptions of this application defined by the appended claims, and are considered to cover any or all of modifications, variations, combinations, or equivalents within the scope of this application. It is clear that a person skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. This application is intended to cover these modifications and variations provided that they fall within the scope of the claims of this application and their equivalent technologies.
Finally, it should be noted that the foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (17)

  1. A training method for a quantum Boltzmann machine, comprising:
    obtaining a first loss function of the quantum Boltzmann machine, wherein a model structure of the quantum Boltzmann machine comprises a first layer and a second layer; quantum units of the first layer are used to assign input samples of labeled samples, and quantum units of the second layer are used to assign output samples of the labeled samples; or the quantum units of the first layer are used to assign input samples of unlabeled samples; the quantum units of the first layer are fully connected to the quantum units of the second layer; the first loss function = α * a second loss function + β * a third loss function, wherein the second loss function is obtained by a negative log conditional likelihood calculation on the conditional probability of the output sample given the input sample of a labeled sample, and the third loss function is obtained by a negative log conditional likelihood calculation on the marginal probability of the input sample of an unlabeled sample; and α and β are constants;
    obtaining a first partial derivative of the first loss function with respect to a predetermined parameter of the Hamiltonian of the quantum Boltzmann machine, wherein the predetermined parameter comprises a connection weight between two quantum units in the quantum Boltzmann machine or a bias of a quantum unit; and
    performing a gradient algorithm on the first partial derivative to update the predetermined parameter, to obtain an updated quantum Boltzmann machine, wherein the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.
  2. The training method for a quantum Boltzmann machine according to claim 1, wherein the method further comprises:
    performing a negative log conditional likelihood calculation based on the conditional probability of the output sample given the input sample of a labeled sample, to obtain a loss function for supervised learning; and
    converting the loss function for supervised learning into the second loss function by using the Golden-Thompson inequality.
  3. The training method for a quantum Boltzmann machine according to claim 1, wherein the method further comprises:
    performing a negative log conditional likelihood calculation based on the marginal probability of the input sample of an unlabeled sample, to obtain a loss function for unsupervised learning; and
    converting the loss function for unsupervised learning into the third loss function by using the Golden-Thompson inequality.
  4. The training method for a quantum Boltzmann machine according to claim 1, wherein the first partial derivative is expressed as a polynomial, and the method further comprises:
    determining a predetermined sample from a sample data set, wherein the predetermined sample comprises the labeled sample or the unlabeled sample;
    preparing a first quantum state of the predetermined sample;
    performing a quantum approximate optimization algorithm (QAOA) on the first quantum state to obtain a second quantum state; and
    measuring, on the second quantum state, a second partial derivative of the Hamiltonian with respect to the predetermined parameter as a term of the first partial derivative.
  5. The training method for a quantum Boltzmann machine according to claim 4, further comprising:
    computing a first average of the M second partial derivatives obtained for the predetermined sample, and using the first average as a term of the first partial derivative.
  6. The training method for a quantum Boltzmann machine according to claim 4 or 5, further comprising:
    computing a second average of the second partial derivatives corresponding to N samples obtained from the sample data set, and using the second average as a term of the first partial derivative.
  7. The training method for a quantum Boltzmann machine according to claim 1, wherein, when the quantum units of the first layer are used to assign input samples of labeled samples and the quantum units of the second layer are used to assign output samples of the labeled samples, the first layer and the second layer are visible layers;
    or, when the quantum units of the first layer are used to assign input samples of unlabeled samples, the first layer is a visible layer and the second layer is a hidden layer.
  8. A hybrid computer, comprising:
    a digital computer, configured to obtain a first loss function of a quantum Boltzmann machine, wherein a model structure of the quantum Boltzmann machine comprises a first layer and a second layer; quantum units of the first layer are used to assign input samples of labeled samples, and quantum units of the second layer are used to assign output samples of the labeled samples; or the quantum units of the first layer are used to assign input samples of unlabeled samples; the quantum units of the first layer are fully connected to the quantum units of the second layer; the first loss function = α * a second loss function + β * a third loss function, wherein the second loss function is obtained by a negative log conditional likelihood calculation on the conditional probability of the output sample given the input sample of a labeled sample, and the third loss function is obtained by a negative log conditional likelihood calculation on the marginal probability of the input sample of an unlabeled sample; and α and β are constants; and
    a quantum computer, configured to obtain a first partial derivative of the first loss function obtained by the digital computer with respect to a predetermined parameter of the Hamiltonian of the quantum Boltzmann machine, wherein the predetermined parameter comprises a connection weight between two quantum units in the quantum Boltzmann machine or a bias of a quantum unit;
    wherein the digital computer is further configured to perform a gradient algorithm on the first partial derivative obtained by the quantum computer to update the predetermined parameter, to obtain the updated quantum Boltzmann machine, wherein the Hamiltonian of the updated quantum Boltzmann machine uses the updated predetermined parameter.
  9. The hybrid computer according to claim 8, wherein the digital computer is further configured to perform a negative log conditional likelihood calculation based on the conditional probability of the output sample given the input sample of a labeled sample, to obtain a loss function for supervised learning; and to convert the loss function for supervised learning into the second loss function by using the Golden-Thompson inequality.
  10. The hybrid computer according to claim 8, wherein the digital computer is further configured to perform a negative log conditional likelihood calculation based on the marginal probability of the input sample of an unlabeled sample, to obtain a loss function for unsupervised learning; and to convert the loss function for unsupervised learning into the third loss function by using the Golden-Thompson inequality.
  11. The hybrid computer according to claim 8, wherein the first partial derivative is expressed as a polynomial; the digital computer is further configured to determine a predetermined sample from a sample data set, wherein the predetermined sample comprises the labeled sample or the unlabeled sample; and the quantum computer is configured to prepare a first quantum state of the predetermined sample determined by the digital computer, perform the QAOA algorithm on the first quantum state to obtain a second quantum state, and measure, on the second quantum state, a second partial derivative of the Hamiltonian with respect to the predetermined parameter as a term of the first partial derivative.
  12. The hybrid computer according to claim 11, wherein
    the digital computer is further configured to compute a first average of the M second partial derivatives obtained for the predetermined sample, and use the first average as a term of the first partial derivative.
  13. The hybrid computer according to claim 11 or 12, wherein the digital computer is further configured to compute a second average of the second partial derivatives corresponding to N samples obtained from a predetermined sample set, and use the second average as a term of the first partial derivative.
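The two-level averaging of claims 12 and 13 (M repeated measurements per sample, then an average over N samples) can be sketched classically as follows. The function `measure_second_partial` is a hypothetical stand-in for one shot on the quantum computer; its "true value" and noise model are illustrative assumptions:

```python
import numpy as np

def measure_second_partial(sample, rng):
    # Hypothetical stand-in for one quantum measurement of dH/dtheta on the
    # QAOA-prepared state for `sample`; Gaussian noise models shot noise.
    true_value = 0.5 * sample
    return true_value + rng.normal(scale=0.1)

def gradient_term(samples, M, seed=0):
    """First average over M shots per sample (claim 12), then a second
    average over the N samples in the set (claim 13)."""
    rng = np.random.default_rng(seed)
    per_sample = [
        np.mean([measure_second_partial(s, rng) for _ in range(M)])  # first average
        for s in samples
    ]
    return float(np.mean(per_sample))  # second average over N samples

print(gradient_term([1.0, -1.0, 2.0], M=100))
```

With M = 100 shots the estimate concentrates near the illustrative true mean of 1/3; the digital computer would then use this value as one term of the first partial derivative.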
  14. The hybrid computer according to claim 8, wherein when the quantum units of the first layer are assigned the input samples of labeled samples and the quantum units of the second layer are assigned the output samples of the labeled samples, the first layer and the second layer are visible layers;
    or, when the quantum units of the first layer are assigned the input samples of unlabeled samples, the first layer is a visible layer and the second layer is a hidden layer.
  15. A hybrid computer, comprising a processor and a memory;
    wherein the memory is configured to store computer-executable instructions, and when the processor executes the computer-executable instructions, the hybrid computer performs the method according to any one of claims 1 to 7.
  16. A chip, comprising a processor and an interface;
    wherein the processor is configured to read instructions to execute the method according to any one of claims 1 to 7.
  17. A computer-readable storage medium, comprising instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 1 to 7.
PCT/CN2020/077208 2020-02-28 2020-02-28 Training method for quantum boltzmann machine, and hybrid computer WO2021168798A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/077208 WO2021168798A1 (en) 2020-02-28 2020-02-28 Training method for quantum boltzmann machine, and hybrid computer
CN202080081890.7A CN114730385A (en) 2020-02-28 2020-02-28 Training method of quantum Boltzmann machine and hybrid computer


Publications (1)

Publication Number Publication Date
WO2021168798A1 2021-09-02

Family

ID=77490590


Country Status (2)

Country Link
CN (1) CN114730385A (en)
WO (1) WO2021168798A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114861928A (en) * 2022-06-07 2022-08-05 北京大学 Quantum measurement method and device and computing equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170364796A1 (en) * 2014-12-05 2017-12-21 Microsoft Technology Licensing, Llc Quantum deep learning
CN108369668A (en) * 2015-10-16 2018-08-03 D-波系统公司 For create and using quantum Boltzmann machine system and method
CN109886342A (en) * 2019-02-26 2019-06-14 视睿(杭州)信息科技有限公司 Model training method and device based on machine learning



Also Published As

Publication number Publication date
CN114730385A (en) 2022-07-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20921513; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20921513; Country of ref document: EP; Kind code of ref document: A1)