CN113592095A

CN113592095A - Model training method and device based on quantum computation

Info

Publication number: CN113592095A
Application number: CN202110893355.8A
Authority: CN
Inventors: 龙桂鲁; 高攀
Original assignee: Beijing Institute Of Quantum Information Science
Current assignee: Beijing Institute Of Quantum Information Science
Priority date: 2021-08-04
Filing date: 2021-08-04
Publication date: 2021-11-02
Anticipated expiration: 2041-08-04
Also published as: CN113592095B

Abstract

An embodiment of the invention provides a model training method based on quantum computation, comprising: acquiring a training data set of a first model and initial parameters of the first model; setting up a quantum circuit according to Newton's method for the training data set and the initial parameters; and running the quantum circuit to obtain the optimized parameters of the first model.

Description

Model training method and device based on quantum computation

Technical Field

The invention relates to the field of quantum computation, in particular to a model training method and device based on quantum computation.

Background

Recommendation systems are needed in many business areas. For example, recommending products to users, including e-commerce product recommendation, short-video push and the like, is a core task of almost all resource-integrating network platforms. Using a machine learning model to select the products recommended by a recommendation system is currently a common technique in the industry. Some of these schemes, such as those using a factorization machine model, work well when the historical interaction data between users and products is sparse. However, because the complexity of the optimization (training) process, for example for the factorization machine model, grows linearly with the parameter dimension, the training task becomes increasingly difficult to complete as user and product information is enriched.

Therefore, a better model training method is needed.

Disclosure of Invention

Embodiments of the present invention provide a model training method and apparatus based on quantum computing, which can greatly reduce the consumed computing resources compared to a model training method based on a classical computer.

To solve the above technical problem, a first aspect of the invention provides a model training method based on quantum computation, comprising:

acquiring a training data set of a first model and initial parameters of the first model;

setting up a quantum circuit according to Newton's method for the training data set and the initial parameters;

and operating the quantum circuit to obtain the optimized parameters of the first model.

Preferably, setting up the quantum circuit according to Newton's method for the training data set and the initial parameters includes:

determining a first matrix according to the training data set;

determining a loss function for training the first model according to the first matrix and the initial parameters;

setting up the quantum circuit based on Newton's method according to the loss function.

Preferably, the quantum circuit comprises at least a first module for implementing the equivalent gradient of the loss function and a second module for implementing the inverse of the second derivative of the loss function.

Preferably, running the quantum circuit to obtain the optimized parameters of the first model includes running the quantum circuit to evolve a first quantum state representing the initial parameters into a second quantum state representing the optimized parameters.

Preferably, the method further comprises: performing a projective measurement on the second quantum state, determining whether the second quantum state reaches a preset precision, and ending the training of the first model when the second quantum state reaches the preset precision.

Preferably, the quantum circuit comprises a first single-bit quantum register up, a second single-bit quantum register d, a third single-bit quantum register h, a first multi-bit quantum register e and a second multi-bit quantum register v;

running the quantum circuit to evolve the first quantum state into the second quantum state comprises:

placing up, d and h in the quantum state |0>, placing e in the quantum state |0>^⊗χ, and placing v in the first quantum state |X>, where χ is the number of bits of e;

after applying to up a rotation about the y-axis by angle η, performing on e and v a quantum phase estimation operation with respect to the equivalent gradient operator;

performing on d a rotation operation controlled by registers up and e;

performing an inverse operation of the quantum phase estimation operation of the equivalent gradient operator on e and v;

performing quantum phase estimation operation of an equivalent Hessian operator on e and v;

performing a rotation operation on h controlled by registers up and e;

performing inverse operation of quantum phase estimation operation of the equivalent Hessian operator on e and v;

after a rotation operation about the y-axis and at an angle η is performed on up, a second quantum state is obtained from v.

Preferably, the first model is a factorization machine model.

A second aspect provides a quantum computing-based model training apparatus, the apparatus comprising:

a training data and initial parameter acquisition unit configured to acquire a training data set of a first model and initial parameters of the first model;

a quantum circuit setting unit configured to set up a quantum circuit according to Newton's method for the training data set and the initial parameters;

and a model training unit configured to run the quantum circuit to obtain the optimized parameters of the first model.

A third aspect provides a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.

A fourth aspect provides a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of the first aspect.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a model training method based on quantum computation according to an embodiment of the present invention;

Fig. 2 is an overall flowchart of a model training method based on quantum computation and of prediction using the model according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a Newton-optimization quantum circuit according to an embodiment of the present invention;

Fig. 4 is a structural diagram of a model training apparatus based on quantum computing according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As mentioned above, for some machine learning models, such as a factorization machine model, the complexity of the optimization process (i.e., the training process) is linear in the parameter dimension, so the cost of training grows rapidly as the dimension of the processed data increases and may consume excessive computing resources. Quantum computers, which process data on quantum processors, have potentially powerful computational capabilities and in many cases offer a significant speedup over classical approaches. The core idea of the model training method based on quantum computation provided by the embodiments of the invention is to train a machine learning model on a quantum processor through a quantum circuit corresponding to a quantum Newton method. With this method, both the qubit resources and the computational complexity required for each iterative parameter update scale logarithmically with the parameter dimension, i.e., an exponential speedup over classical training can be achieved. This is a great advantage in training the family of machine learning models represented by the factorization machine. Compared with schemes based on quantum gradient optimization, the method provided by the embodiments also converges faster during training, i.e., trains faster.
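The logarithmic qubit count mentioned above can be illustrated with a small classical sketch. It assumes that the parameter vector is stored by amplitude encoding, which the text suggests by writing the parameters as a quantum state |X> but does not name explicitly; the function name `amplitude_encode` is hypothetical.

```python
import math
import numpy as np

def amplitude_encode(params):
    """Pad params to a power-of-two length and L2-normalise them.

    Returns (state, n_qubits). Amplitude encoding is an assumption here;
    the source only states that a quantum state represents the parameters.
    """
    x = np.asarray(params, dtype=float)
    norm = np.linalg.norm(x)
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    # A register of n qubits holds 2**n amplitudes.
    n_qubits = max(1, math.ceil(math.log2(x.size)))
    state = np.zeros(2 ** n_qubits)
    state[: x.size] = x / norm
    return state, n_qubits

state, n_qubits = amplitude_encode([1.0, 0.5, -0.25, 2.0, 0.0])
_, n_large = amplitude_encode(np.ones(1024))
```

Under this assumption a 5-component vector fits in 3 qubits and a 1024-component vector in 10, which is the sense in which the register size grows with the logarithm of the parameter dimension.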

Fig. 1 is a flowchart of a model training method based on quantum computation according to an embodiment of the present invention. As shown, the process of the method at least includes:

Step 11: a training data set of a first model and initial parameters of the first model are obtained.

In this step, a data set for training a model to be trained and initial parameters of the model to be trained are obtained. In one embodiment, the initial parameters may be random. In another embodiment, the initial parameters may be determined empirically.

In one embodiment, the first model may be a factorization machine model. In other embodiments, the first model may also be another machine learning model whose parameters can be optimized by Newton's method, which is not limited in this specification.

Step 12: for the training data set and the initial parameters, a quantum circuit is set up according to Newton's method.

Newton's method is a method for optimizing machine learning model parameters. Compared with the optimization methods commonly used in model training that rely only on the gradient of the loss function, such as gradient descent, it has the advantage of faster convergence. The reason is that Newton's method uses second-order information (the second derivatives) of the loss function, whereas gradient descent uses only first-order information (the gradient); training based on Newton's method therefore converges in fewer iterations and is more efficient.
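The difference in convergence speed can be seen on a toy problem. The following is a minimal classical sketch, not the quantum procedure of the embodiments; the quadratic loss, the matrix `Q`, the vector `b` and the learning rate are illustrative assumptions.

```python
import numpy as np

# Toy quadratic loss f(x) = 0.5 * x^T Q x - b^T x (illustrative assumption).
Q = np.array([[10.0, 0.0], [0.0, 1.0]])   # Hessian of the loss
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(Q, b)            # exact minimiser

def grad(x):
    """Gradient of the quadratic loss."""
    return Q @ x - b

def gradient_descent_steps(x0, lr, tol=1e-8, max_iter=10_000):
    """Count first-order steps needed to get within tol of the minimiser."""
    x, steps = x0.copy(), 0
    while np.linalg.norm(x - x_star) > tol and steps < max_iter:
        x = x - lr * grad(x)
        steps += 1
    return steps

def newton_steps(x0, tol=1e-8, max_iter=100):
    """Count Newton steps x <- x - H^{-1} grad(x) needed to get within tol."""
    x, steps = x0.copy(), 0
    while np.linalg.norm(x - x_star) > tol and steps < max_iter:
        x = x - np.linalg.solve(Q, grad(x))
        steps += 1
    return steps

x0 = np.zeros(2)
gd = gradient_descent_steps(x0, lr=0.1)
nt = newton_steps(x0)
```

On this toy problem Newton's method reaches the minimiser in a single step, because the loss is exactly quadratic, while gradient descent with the chosen learning rate needs well over a hundred steps.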

To set up the quantum circuit according to Newton's method, the loss function of the model training must first be obtained. Thus, in one embodiment, a first matrix may be determined from the training data set; a loss function for training the first model may be determined according to the first matrix and the initial parameters; and the quantum circuit may be set up based on Newton's method according to the loss function.

In one embodiment, the quantum circuit may include at least a first module for implementing the equivalent gradient of the loss function and a second module for implementing the inverse of the second derivative of the loss function.

Step 13: the quantum circuit is run to obtain the optimized parameters of the first model.

In this step, the quantum state representing the initial parameters may be evolved, by means of the quantum circuit, into the quantum state representing the optimized parameters.

In one embodiment, the quantum circuit may be run to evolve a first quantum state representing the initial parameters into a second quantum state representing the optimized parameters.

In one embodiment, a projective measurement may be performed on the second quantum state to determine whether the second quantum state reaches a predetermined precision; if it does, the training of the first model is ended.

In one embodiment, the quantum circuit may include a first single-bit quantum register up, a second single-bit quantum register d, a third single-bit quantum register h, a first multi-bit quantum register e and a second multi-bit quantum register v.

The first quantum state is evolved to a second quantum state by:

placing up, d and h in the quantum state |0>, placing e in the quantum state |0>^⊗χ, and placing v in a first random quantum state |X>, where χ is the number of bits of e;

performing on d a rotation operation controlled by registers up and e;

performing on h a rotation operation controlled by registers up and e;

after applying to up a rotation about the y-axis by angle η, performing Z-basis projective measurements on up, d and h respectively; if the measured state of up, d and h is |0>_up|0>_d|0>_h, the second quantum state is obtained from v.

In one embodiment, a prediction for data to be predicted may be made based on the obtained second quantum state. In one example, the data to be predicted may be obtained, and a second matrix may be derived from it; a fourth quantum state is obtained according to the method shown in Fig. 1; a quantum measurement with respect to the second matrix is performed on the fourth quantum state to obtain a first result; and the first result is output as the prediction result for the data to be predicted.

The following describes, through a complete embodiment, the model training method based on quantum computation provided by the embodiments of the present invention and how the trained model is used for prediction. Fig. 2 is an overall flowchart of a model training method based on quantum computation and of prediction using the model according to an embodiment of the present invention. As shown in Fig. 2, the process includes the following steps:

Step I: the loss function f of the first model, for example a factorization machine model, can be written in the form of a matrix product, the mathematical expression of which is

where the parameter to be optimized (i.e., the initial parameter) is represented as the vector X = (1, x₁, x₂, …, x_d)^T, p is the number of vector parameters, x₁ to x_d are the vector elements of X, and A is the matrix form of the training data (i.e., the first matrix) and is a symmetric matrix. Note that, in one embodiment, the training data may be mapped to the symmetric matrix A by, for example, a classical computer.

Step II: for the loss function f, a quantum circuit corresponding to the quantum Newton method and composed of a number of quantum gates is set up (for example, as shown in Fig. 3, which is described in detail later).

Step III: a direct-product quantum state |X> is prepared, and the quantum circuit evolves the quantum state |X> into an updated quantum state |X'>.

Step IV: a quantum measurement is performed on the updated quantum state |X'> to judge whether the optimization precision is met. If so, the process stops and the updated state |X'> is output. If not, the updated state |X'> is taken as the new input quantum state |X>, and Step III is executed again.

Step V: the data to be predicted is rearranged and mapped into a symmetric matrix B, and a quantum measurement whose mathematical expression is <X'|B|X'> is performed on the output quantum state |X'> obtained in Step IV, and the result is output. The result is the recommendation prediction value of the first model. In one embodiment, the data to be predicted may be, for example, collected user history data. In one embodiment, the data to be predicted may also be remapped to the symmetric matrix B by, for example, a classical computer.
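The control flow of the iterate-until-precise loop and the final measurement <X'|B|X'> can be emulated classically with state vectors. The sketch below is only an illustration under stated assumptions: `toy_update` is a hypothetical stand-in for one run of the quantum circuit, the overlap-based stopping rule is assumed (the text does not specify how the precision is judged), and `symmetrize` is a hypothetical remapping of the data to a symmetric matrix.

```python
import numpy as np

def train(x0, update_step, tol=1e-10, max_iter=100):
    """Iterate update_step until two successive normalised states agree."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        x_new = update_step(x)
        x_new = x_new / np.linalg.norm(x_new)
        # Assumed stopping rule: overlap of successive states close to 1.
        if 1.0 - abs(np.dot(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

def symmetrize(M):
    """Hypothetical classical remapping of a square matrix to a symmetric one."""
    return 0.5 * (M + M.T)

def predict(x_state, B):
    """Expectation value <X'|B|X'> for a real state vector."""
    x = x_state / np.linalg.norm(x_state)
    return float(x @ B @ x)

def toy_update(x):
    """Stand-in for the circuit: halve every component except the first."""
    return x - 0.5 * np.array([0.0, 1.0, 1.0]) * x

x_out = train(np.array([1.0, 1.0, 1.0]), toy_update)
B = symmetrize(np.array([[2.0, 1.0, 0.0],
                         [3.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0]]))
value = predict(x_out, B)
```

With this toy update the state converges towards (1, 0, 0), so the predicted value approaches the top-left entry of B. On a quantum processor the expectation value would be estimated from repeated measurements rather than computed exactly.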

From the above description, it can be seen that the core of the model training method based on quantum computation provided by the embodiments of the invention lies in realizing the quantum dynamical evolution from the quantum state |X> to the quantum state |X'>. In one embodiment, the mathematical expression of this evolution may be |X'> ∝ |X> - ξ(KHK)^-1 D|X>, where D is the equivalent gradient operator of the loss function (a polynomial function). In one example, the equivalent gradient operator may be obtained from the input data matrix A via a matrix transformation. KHK is the equivalent Hessian matrix of the loss function (the Hessian being the matrix of second partial derivatives of the loss function). In one example, the equivalent Hessian matrix may also be obtained from the input data matrix A via a matrix transformation. ξ is the preset iteration step size.
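The update rule can be checked numerically with a classical emulation. This is a sketch under assumptions, not the circuit of the embodiments: `D` and `H` are small stand-in matrices rather than operators derived from A, and the Moore-Penrose pseudo-inverse is used for (KHK)^-1 because K = diag(0, 1, …, 1) makes KHK singular by construction.

```python
import numpy as np

def newton_update(x, D, H, xi):
    """Classically emulate |X'> ∝ |X> - xi (KHK)^{-1} D |X>."""
    n = x.size
    K = np.diag([0.0] + [1.0] * (n - 1))      # K = diag(0, 1, ..., 1)
    KHK = K @ H @ K
    # Assumption: invert KHK only on its support via the pseudo-inverse.
    step = np.linalg.pinv(KHK) @ (D @ x)
    x_new = x - xi * step
    return x_new / np.linalg.norm(x_new)      # states are defined up to normalisation

# Toy case: for f(x) = 0.5 * x^T H x the gradient is H x, so D = H here.
H = np.diag([1.0, 2.0, 4.0])
D = H.copy()
x = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
x_next = newton_update(x, D, H, xi=1.0)
```

In this toy case a full step (ξ = 1) drives the two free components to zero while the first component, which K excludes from the update, is left unchanged, so `x_next` is proportional to (1, 0, 0).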

Fig. 3 is a schematic diagram of a Newton-optimization quantum circuit according to an embodiment of the present invention. The following describes how a quantum circuit as shown in Fig. 3 is implemented on a quantum processor to realize the quantum dynamical evolution from the quantum state |X> to the quantum state |X'>. In Fig. 3, R_y represents a single-qubit gate that rotates about the y-axis, H represents a Hadamard gate, U_F and U_F^† represent the quantum Fourier transform and the inverse quantum Fourier transform, respectively, and e^-iDt and e^-iKHKt are the first and second Hamiltonian simulation gates.

A quantum circuit, i.e., a circuit composed of quantum logic gates, operates on qubits. Stage-1 in Fig. 3 may correspond to the first module described previously. In one embodiment, the first module may be implemented using a quantum module having an HHL structure. Stage-2 in Fig. 3 may correspond to the second module described previously. In one embodiment, the second module may likewise be implemented using a quantum module having an HHL structure. In other embodiments, the quantum circuit may further include a quantum state initialization module and a measurement and result-determination module. Because iterative optimization is insensitive to the initial parameter values, and in order to improve execution efficiency, in one embodiment the quantum state initialization module may, in the first training iteration, initialize the quantum state |X> to a simple direct-product state or another quantum state that is easy to prepare. In each subsequent iteration, the updated state output by the previous iteration is used as the input quantum state.

The implementation of the quantum circuit is further illustrated below by a specific embodiment, which may include the following steps:

First, in step A, the single-bit quantum registers up, d and h are all placed in the quantum state |0>, and the multi-bit quantum register e is placed in the quantum state |0>^⊗χ, where χ is the number of bits contained in register e and, in one example, may be preset. In one embodiment, in the first training iteration, the multi-bit quantum register v may be placed in an arbitrary easy-to-prepare quantum state |X>. In each subsequent iteration, the state of register v is the output state of the previous iteration.

Then, in step B, after a rotation about the y-axis by angle η is applied to the single-bit register up, a quantum phase estimation operation with respect to the equivalent gradient operator D is performed on registers e and v. The rotation in this step determines the superposition coefficients used when the states are superposed. Thus, in the embodiment in which the quantum dynamical evolution is |X'> ∝ |X> - ξ(KHK)^-1 D|X> as described above, the rotation may be used to determine the iteration step size ξ of the training.

In one embodiment, the mathematical expression of the equivalent gradient operator D is

where P_k is the permutation operation that exchanges the 1st and k-th factors of the p-fold direct product space. In one example, the evolution under the equivalent gradient operator D can be implemented by a quantum principal component analysis method or by a Hamiltonian evolution method based on quantum signal processing, at the cost of consuming multiple copies of the quantum state |X>.

In one embodiment, the quantum phase estimation operation with respect to the equivalent gradient operator D may include: performing a Hadamard gate operation on e; performing on v a first Hamiltonian simulation gate operation controlled by e; and performing a quantum Fourier transform operation on e. The mathematical expression of the first Hamiltonian simulation gate operation is e^-iDt, where D is the equivalent gradient operator, t is time, e (in this expression) is the base of the natural logarithm, and i is the imaginary unit. After this quantum phase estimation operation, the eigenvalue information of D, which appears as a phase on v, is transferred into the quantum state of e. The effect is that, although the equivalent gradient operator D is not itself a unitary operator that can be applied directly as a gate, after the above process the state on e is a quantum state representing D.
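How the three operations transfer an eigenvalue of D into register e can be illustrated with a classical calculation of the outcome distribution. This is a sketch under assumptions: register v is taken to hold an exact eigenvector of D, basis state |j> of e is taken to control the j-th power of e^-iDt, and the Fourier transform uses the sign convention written in the code; `qpe_distribution` is a hypothetical name.

```python
import numpy as np

def qpe_distribution(eigenvalue, t, chi):
    """Outcome probabilities of a chi-qubit register e after phase estimation.

    Models U = exp(-i D t) acting on an eigenvector of D with the given
    eigenvalue (assumption: v holds an exact eigenvector).
    """
    size = 2 ** chi
    idx = np.arange(size)
    # Hadamards create a uniform superposition over |idx>; the controlled
    # powers of U attach the phase exp(-i * eigenvalue * t * idx) to |idx>.
    amps = np.exp(-1j * eigenvalue * t * idx) / np.sqrt(size)
    # Fourier transform on e with matrix elements exp(+2*pi*i*k*idx/size)/sqrt(size).
    k = idx.reshape(-1, 1)
    F = np.exp(2j * np.pi * k * idx / size) / np.sqrt(size)
    return np.abs(F @ amps) ** 2

# eigenvalue * t = 2*pi*3/8, so a 3-qubit register resolves it exactly.
probs = qpe_distribution(eigenvalue=1.0, t=2 * np.pi * 3 / 8, chi=3)
peak = int(np.argmax(probs))
```

Here the distribution is concentrated on index 3, i.e. the register holds the integer k with eigenvalue ≈ 2πk / (t · 2^χ). For eigenvalues that are not exactly representable the distribution spreads over neighbouring indices, which is why a larger χ gives a finer estimate.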

Next, in step C, a rotation operation controlled by registers up and e is performed on the single-bit register d. The rotation in this step controls d according to the quantum state on e (which represents D; in one example, specifically an eigenstate |λ> of D) and the state of up, so that d becomes correlated with e and up. In one example, the rotation is enabled when register up is in state |1>, and the state |λ> of e controls the rotation angle.

Then, in step D, the inverse of the quantum phase estimation operation with respect to the equivalent gradient operator D described in step B is performed, so as to erase the state of register e and return it to the quantum state |0>^⊗χ.

In one embodiment, the inverse operation may include: performing an inverse quantum Fourier transform operation on e; performing on v the inverse of the first Hamiltonian simulation gate operation, controlled by e; and performing a Hadamard gate operation on e. The mathematical expression of the inverse of the first Hamiltonian simulation gate operation is e^iDt, where D is the equivalent gradient operator, t is time, e (in this expression) is the base of the natural logarithm, and i is the imaginary unit.

Next, in step E, a quantum phase estimation operation with respect to the equivalent Hessian matrix KHK is performed on e and v.

Here K = diag(0, 1, 1, …, 1) is a diagonal matrix. In one embodiment, K may be equivalently implemented by adjusting the corresponding quantum black-box (oracle) operation in a Hamiltonian simulation process based on quantum signal processing.

In the expression for H, the two permutation operations exchange, respectively, the 1st and k₁-th factors and the 2nd and k₂-th factors of the direct product space; S is the permutation operation that exchanges the 1st and 2nd factor subspaces; |X> is the quantum-state form of the parameter X to be optimized; p is the number of vector parameters; and A is the matrix form of the training data.

In one embodiment, the quantum phase estimation operation with respect to the equivalent Hessian matrix may include: performing a Hadamard gate operation on e; performing on v a second Hamiltonian simulation gate operation controlled by e; and performing a quantum Fourier transform operation on e. The mathematical expression of the second Hamiltonian simulation gate operation is e^-iKHKt, where KHK is the equivalent Hessian matrix, t is time, e (in this expression) is the base of the natural logarithm, and i is the imaginary unit.

Then, in step F, a rotation operation controlled by registers up and e is performed on the single-bit register h. The rotation in this step controls h according to the quantum state on e (which represents KHK; in one example, specifically an eigenstate |λ> of KHK) and the state of up, so that h becomes correlated with e and up. In one embodiment, the rotation is enabled when register up is in state |1>, and the state |λ> of e controls the rotation angle.

Subsequently, in step G, the inverse of step E is performed, i.e., the inverse of the quantum phase estimation operation with respect to the equivalent Hessian matrix is performed on e and v. In one embodiment, the inverse operation may include: performing an inverse quantum Fourier transform operation on e; performing on v the inverse of the second Hamiltonian simulation gate operation, controlled by e; and performing a Hadamard gate operation on e. The mathematical expression of the inverse of the second Hamiltonian simulation gate operation is e^iKHKt, where KHK is the equivalent Hessian matrix, t is time, e (in this expression) is the base of the natural logarithm, and i is the imaginary unit.

Finally, in step H, after a rotation about the y-axis by angle η is applied to the single-bit register up, Z-basis projective measurements are performed on the single-bit registers up, d and h respectively, and the outcome |0>_up|0>_d|0>_h is post-selected. Register v is then in the updated quantum state after one iteration of the quantum Newton method, |X'> ∝ cos²(η)|X> - sin²(η)(KHK)^-1 D|X>, which completes one iteration of the quantum Newton optimization.
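The link between the rotation angle η and the iteration step size ξ follows by factoring cos²(η) out of the post-selected state. The short derivation below assumes only the two expressions for |X'> given in this description and that cos η ≠ 0.

```latex
\begin{aligned}
|X'\rangle &\propto \cos^2(\eta)\,|X\rangle - \sin^2(\eta)\,(KHK)^{-1} D\,|X\rangle \\
           &= \cos^2(\eta)\,\Bigl[\,|X\rangle - \tan^2(\eta)\,(KHK)^{-1} D\,|X\rangle \Bigr] \\
           &\propto |X\rangle - \xi\,(KHK)^{-1} D\,|X\rangle ,
\qquad \xi = \tan^2(\eta), \quad \cos\eta \neq 0 .
\end{aligned}
```

For example, η = π/4 corresponds to a full step ξ = 1, and smaller angles correspond to smaller steps.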

It should be noted that, before step H, registers e and v have each been subjected to an operation together with its inverse, so their state is equivalent to that obtained without applying Stage-1 and Stage-2, i.e., v is in |X>. After the rotation on up and the projective measurements on up, d and h, the state of v evolves to |X'>, i.e., cos²(η)|X> - sin²(η)(KHK)^-1 D|X>, owing to the correlations between v and up, d, h established in steps C and F.

It should also be noted that, since a projective measurement is a destructive measurement, if the projective measurements of up, d and h do not all yield |0>, i.e., the outcome is not |0>_up|0>_d|0>_h, the updated quantum state is not obtained and the iteration is not completed, so the iteration has to be repeated.

The model training method based on quantum computation provided by the embodiments of the invention has the following advantages. On the one hand, the computational complexity of a model training method based on a classical computer can at best be linear in the variable dimension, whereas the method provided by the embodiments of the invention reduces the training complexity to the order of the logarithm of the variable dimension, so fewer computing resources are consumed during training. On the other hand, compared with a model training method based on quantum gradient optimization, the method provided by the embodiments of the invention converges faster during training, so the model is trained faster.

According to an embodiment of another aspect, there is also provided a computer-readable medium comprising computer-executable instructions stored thereon, wherein the computer-executable instructions, when executed on a quantum computer, cause the quantum computer to perform the method shown above.

According to an embodiment of another aspect, there is also provided a computing device comprising a memory having stored therein executable code, and a processor that, when executing the executable code, implements the method described above.

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A model training method based on quantum computation comprises the following steps:

2. The method of claim 1, wherein setting up the quantum circuit according to Newton's method for the training data set and the initial parameters comprises:

determining a first matrix according to the training data set;

3. The method of claim 2, wherein the quantum circuit comprises at least a first module for implementing the equivalent gradient of the loss function and a second module for implementing the inverse of the second derivative of the loss function.

4. The method of claim 2, wherein running the quantum circuit to obtain the optimized parameters of the first model comprises running the quantum circuit to evolve a first quantum state representing the initial parameters into a second quantum state representing the optimized parameters.

5. The method of claim 4, further comprising,

6. The method of claim 4, wherein the quantum circuit comprises a first single-bit quantum register up, a second single-bit quantum register d, a third single-bit quantum register h, a first multi-bit quantum register e and a second multi-bit quantum register v;

placing up, d and h in the quantum state |0>, placing e in the quantum state |0>^⊗χ, and placing v in the first quantum state |X>, where χ is the number of bits of e;

performing on d a rotation operation controlled by registers up and e;

performing on h a rotation operation controlled by registers up and e;

7. The method of claim 1, wherein the first model is a factorization machine model.

8. A quantum computing-based model training apparatus, the apparatus comprising:

9. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-7.

10. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-7.