CN116523059A - Data processing method, machine learning framework and related equipment - Google Patents

Data processing method, machine learning framework and related equipment

Info

Publication number
CN116523059A
Authority
CN
China
Prior art keywords
machine learning
layer
learning model
calculation
graph
Prior art date
Legal status
Pending
Application number
CN202210083468.6A
Other languages
Chinese (zh)
Inventor
方圆
孔小飞
李蕾
王汉超
窦猛汉
Current Assignee
Benyuan Quantum Computing Technology Hefei Co ltd
Original Assignee
Benyuan Quantum Computing Technology Hefei Co ltd
Priority date
Filing date
Publication date
Application filed by Benyuan Quantum Computing Technology Hefei Co ltd filed Critical Benyuan Quantum Computing Technology Hefei Co ltd
Priority to CN202210083468.6A
Priority to PCT/CN2022/143598 (WO2023125858A1)
Publication of CN116523059A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 10/00: Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • G06N 20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data processing method, a machine learning framework and related equipment, applied to electronic equipment comprising the machine learning framework, wherein the machine learning framework comprises a data structure module, a quantum module and a classical module. The method comprises the following steps: invoking the data structure module to acquire input data and create tensor data comprising the input data, and invoking the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the plurality of calculation layers; determining a first calculation layer to be executed corresponding to the tensor data from the plurality of calculation layers; creating a computation graph comprising a sub-computation graph corresponding to the first calculation layer based on the forward propagation relationship; and determining an output result of the machine learning model based on the computation graph. Based on this technical scheme, the difficulty of debugging the machine learning model can be reduced and development efficiency improved.

Description

Data processing method, machine learning framework and related equipment
Technical Field
The invention belongs to the technical field of quantum computing, and particularly relates to a data processing method, a machine learning framework and related equipment.
Background
Machine learning models are widely used in artificial intelligence research because of their excellent performance. A machine learning model can be trained with labeled training data until it meets expectations, and it can then be used for specific application work such as speech recognition and image recognition. A machine learning model does not require the standards for a specific application to be set up manually; the corresponding working standards are established by training the model, so the model adapts well to different applications. With the development of quantum computing, machine learning models containing quantum programs are also increasing.
In the related art, developing a machine learning model containing a quantum program requires the program to be debugged continuously. When the machine learning model is complex, debugging it requires a large amount of work, and the debugging efficiency is low.
Disclosure of Invention
The invention aims to provide a data processing method, a machine learning framework and related equipment, so as to reduce the difficulty of debugging a machine learning model containing a quantum program and improve development efficiency.
To achieve the above object, a first aspect of an embodiment of the present invention provides a data processing method applied to an electronic device including a machine learning framework including a data structure module, a quantum module, and a classical module, the method including:
Invoking the data structure module to acquire input data and create tensor data comprising the input data, and invoking the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the plurality of calculation layers;
determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers;
creating a computation graph comprising a sub-computation graph corresponding to the first computation layer based on the forward propagation relationship;
an output result of the machine learning model is determined based on the computational graph.
Optionally, the creating a computation graph including the sub computation graph corresponding to the first computation layer based on the forward propagation relationship includes:
determining whether the first computational layer is preceded by an unexecuted second computational layer associated with the first computational layer based on the forward propagation relationship;
if there is an unexecuted second computational layer associated with the first computational layer, executing the second computational layer, and determining a computational relationship between an output of the second computational layer and an output of the first computational layer;
and adding the sub-calculation graph corresponding to the first calculation layer to the calculation graph corresponding to the second calculation layer based on the calculation relation to obtain a new calculation graph.
Optionally, the method further comprises:
if there is no unexecuted second calculation layer associated with the first calculation layer, creating the computation graph corresponding to the first calculation layer.
Optionally, the adding the sub-computation graph corresponding to the first computation layer to the computation graph corresponding to the second computation layer based on the computation relation to obtain a new computation graph includes:
based on the calculation relation, taking the output corresponding calculation node of the first calculation layer as a subsequent node of the output corresponding calculation node of the second calculation layer, and adding the subsequent node into a calculation graph corresponding to the second calculation layer;
and adding the dependent variable corresponding computing node of the first computing layer as a precursor node of the output corresponding computing node of the first computing layer into the computing graph corresponding to the second computing layer to obtain a new computing graph.
Optionally, the determining the output result of the machine learning model based on the calculation map includes:
executing the first calculation layer based on the calculation map to obtain output of the first calculation layer;
an output result of the machine learning model is determined based on the output of the first computational layer.
Optionally, the method further comprises:
invoking the classical module to create a training layer of the machine learning model;
inputting the output result of the machine learning model into the training layer to add the corresponding sub-calculation graph of the training layer into the calculation graph based on the relation between the training layer and the machine learning model;
and updating parameters of the machine learning model based on the calculation map to obtain the trained machine learning model.
Optionally, the training layer includes a loss function layer and an optimizer layer, and the classical module includes:
a loss function unit configured to calculate a loss function of the machine learning model;
an optimizer unit configured to update parameters of the machine learning model based on the loss function when training the machine learning model to optimize the machine learning model;
the invoking the classical module to create a training layer of the machine learning model includes:
calling the loss function unit to create the loss function layer;
and calling the optimizer unit to create the optimizer layer.
Optionally, the inputting the output result of the machine learning model into the training layer to add the training layer corresponding sub-computation graph to the computation graph based on the relation between the training layer and the machine learning model includes:
Inputting the output result of the machine learning model into the loss function layer to calculate the value of the loss function of the machine learning model, and adding the calculation node corresponding to the value of the loss function as the subsequent node of the calculation node corresponding to the output result of the machine learning model into the calculation graph;
the updating the parameters of the machine learning model based on the calculation map to obtain the trained machine learning model comprises the following steps:
inputting the value of the loss function into the optimizer layer to update parameters of the machine learning model based on the value of the loss function and the computational graph when it is determined that the value of the loss function does not meet a preset condition;
determining a value of the loss function of the machine learning model after updating the parameter;
and when the value of the loss function meets the preset condition, the machine learning model after updating the parameters is used as the machine learning model after training.
Optionally, the updating the parameters of the machine learning model based on the values of the loss function and the computational graph includes:
calculating a gradient of the loss function relative to parameters of the machine learning model based on the value of the loss function and the computational graph;
Updating parameters of the machine learning model based on the gradient and gradient descent algorithm.
Optionally, the calculating a gradient of the loss function with respect to the parameter of the machine learning model based on the value of the loss function and the computational graph includes:
determining paths from the loss function corresponding computing nodes to the parameter corresponding computing nodes of the machine learning model in the computing graph;
calculating an intermediate gradient of each calculation node of the non-leaf nodes on the path relative to a predecessor node of the calculation node based on the value of the loss function;
multiplying all the calculated intermediate gradients to obtain the gradient of the loss function relative to the parameter.
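As a brief worked illustration of this chain rule (a purely illustrative example with arbitrary numbers), consider a single computation layer y = w*x + b with a squared-error loss Loss = (y - t)^2. The path in the computation graph from the Loss node to the parameter node w passes through y, and multiplying the intermediate gradients along that path gives the gradient of the loss relative to the parameter:

    # Worked example: Loss = (y - t)**2 with y = w*x + b; path in the graph: Loss -> y -> w.
    w, x, b, t = 2.0, 3.0, 1.0, 5.0
    y = w * x + b                      # forward value, y = 7.0
    d_loss_d_y = 2 * (y - t)           # intermediate gradient of the Loss node relative to its precursor y
    d_y_d_w = x                        # intermediate gradient of the y node relative to its precursor w
    d_loss_d_w = d_loss_d_y * d_y_d_w  # multiply all intermediate gradients along the path
    print(d_loss_d_w)                  # -> 12.0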
In a second aspect of an embodiment of the present invention, there is provided a data processing apparatus applied to an electronic device including a machine learning framework including a data structure module, a quantum module, and a classical module, the apparatus including:
the first creating module is used for calling the data structure module to acquire input data and create tensor data comprising the input data, calling the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the calculation layers;
The determining module is used for determining a first computing layer to be executed corresponding to the tensor data from a plurality of computing layers;
a second creating module, configured to create a computation graph including computation nodes corresponding to the first computation layer based on the forward propagation relationship;
and the output module is used for determining an output result of the machine learning model based on the calculation graph.
Optionally, the second creation module is further configured to:
determining whether the first computational layer is preceded by an unexecuted second computational layer associated with the first computational layer based on the forward propagation relationship;
executing a second computational layer associated with the first computational layer when there is an unexecuted second computational layer, and determining a computational relationship between an output of the second computational layer and an output of the first computational layer;
and adding the sub-calculation graph corresponding to the first calculation layer to the calculation graph corresponding to the second calculation layer based on the calculation relation to obtain a new calculation graph.
Optionally, the apparatus further comprises:
and a third creating module, configured to create the computation graph corresponding to the first computation layer when there is no second computation layer associated with the first computation layer that is not executed.
Optionally, the second creation module is further configured to:
based on the calculation relation, taking the output corresponding calculation node of the first calculation layer as a subsequent node of the output corresponding calculation node of the second calculation layer, and adding the subsequent node into a calculation graph corresponding to the second calculation layer;
and adding the dependent variable corresponding computing node of the first computing layer as a precursor node of the output corresponding computing node of the first computing layer into the computing graph corresponding to the second computing layer to obtain a new computing graph.
Optionally, the output module is further configured to:
executing the first calculation layer based on the calculation map to obtain output of the first calculation layer;
an output result of the machine learning model is determined based on the output of the first computational layer.
Optionally, the apparatus further comprises:
a fourth creation module for calling the classical module to create a training layer of the machine learning model;
the input module is used for inputting the output result of the machine learning model into the training layer so as to add a corresponding sub-calculation graph of the training layer into the calculation graph based on the relation between the training layer and the machine learning model;
and the updating module is used for updating the parameters of the machine learning model based on the calculation graph to obtain the trained machine learning model.
Optionally, the training layer includes a loss function layer and an optimizer layer, and the classical module includes:
a loss function unit configured to calculate a loss function of the machine learning model;
an optimizer unit configured to update parameters of the machine learning model based on the loss function when training the machine learning model to optimize the machine learning model;
the fourth creation module is further configured to:
calling the loss function unit to create the loss function layer;
and calling the optimizer unit to create the optimizer layer.
Optionally, the input module is further configured to:
inputting the output result of the machine learning model into the loss function layer to calculate the value of the loss function of the machine learning model, and adding the calculation node corresponding to the value of the loss function as the subsequent node of the calculation node corresponding to the output result of the machine learning model into the calculation graph;
the update module is further configured to:
inputting the value of the loss function into the optimizer layer to update parameters of the machine learning model based on the value of the loss function and the computational graph when it is determined that the value of the loss function does not meet a preset condition;
Determining a value of the loss function of the machine learning model after updating the parameter;
and when the value of the loss function meets the preset condition, the machine learning model after updating the parameters is used as the machine learning model after training.
Optionally, the updating module is further configured to:
calculating a gradient of the loss function relative to parameters of the machine learning model based on the value of the loss function and the computational graph;
updating parameters of the machine learning model based on the gradient and gradient descent algorithm.
Optionally, the updating module is further configured to:
determining paths from the loss function corresponding computing nodes to the parameter corresponding computing nodes of the machine learning model in the computing graph;
calculating an intermediate gradient of each calculation node of the non-leaf nodes on the path relative to a predecessor node of the calculation node based on the value of the loss function;
multiplying all the calculated intermediate gradients to obtain the gradient of the loss function relative to the parameter.
In a third aspect of embodiments of the present invention, there is provided a machine learning framework, the framework comprising:
a data structure module configured to obtain input data and create tensor data comprising the input data;
A quantum module configured to create a machine learning model;
a classical module configured to create a machine learning model comprising a plurality of computational layers and a forward propagation relationship between the plurality of computational layers;
the classical module is further configured to determine a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers; creating a computation graph comprising computation nodes corresponding to the first computation layer based on the forward propagation relationship; an output result of the machine learning model is determined based on the computational graph.
A fourth aspect of an embodiment of the present invention provides a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of the method of any of the first aspects above when run.
A fifth aspect of an embodiment of the present invention provides an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of the method according to any of the first aspects above.
Based on this technical scheme, for a machine learning model created by calling the machine learning framework, a first calculation layer to be executed is first determined among the plurality of calculation layers included in the machine learning model, a computation graph including the sub-computation graph corresponding to the first calculation layer is then created, and the output result of the machine learning model is determined from that computation graph. In other words, each calculation layer is executed immediately after its computation graph is created, rather than only after the computation graphs of all calculation layers have been created. When the machine learning model is debugged, it can therefore be run layer by layer and debugged according to the layer-by-layer running results, which makes it easier to locate problems in the model, reduces the difficulty of debugging the machine learning model, and speeds up debugging.
Drawings
Fig. 1 is a block diagram of a hardware structure of a computer terminal showing a data processing method according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating a method of data processing according to an exemplary embodiment.
FIG. 3 is a block diagram of a machine learning framework, according to an example embodiment.
Fig. 4 is a flowchart showing a data processing method including step S23 according to an exemplary embodiment.
Fig. 5 is a flowchart showing a data processing method including step S233 according to an exemplary embodiment.
FIG. 6 is a computational diagram of a machine learning model, according to an example embodiment.
Fig. 7 is a flowchart showing a data processing method including step S24 according to an exemplary embodiment.
FIG. 8 is another flow chart illustrating a method of data processing according to an exemplary embodiment.
FIG. 9 is another flow chart illustrating a method of data processing according to an exemplary embodiment.
Fig. 10 is a block diagram illustrating classical modules comprised by a data processing device according to an exemplary embodiment.
Fig. 11 is a flowchart illustrating a data processing method including step S95 according to an exemplary embodiment.
Fig. 12 is a flowchart showing a data processing method including step S97 according to an exemplary embodiment.
Fig. 13 is a flowchart illustrating a data processing method including step S971 according to an exemplary embodiment.
Fig. 14 is a flowchart illustrating a data processing method including step S9711 according to an exemplary embodiment.
Fig. 15 is a block diagram of a data processing apparatus according to an exemplary embodiment.
Detailed Description
The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
The embodiment of the invention first provides a data processing method, which can be applied to electronic equipment such as a computer terminal, in particular an ordinary computer, a quantum computer, and the like.
The following takes a computer terminal as an example and describes its operation in detail. Fig. 1 is a block diagram of the hardware structure of a computer terminal for a data processing method according to an exemplary embodiment. As shown in fig. 1, the computer terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing a quantum-circuit-based data processing method, and optionally a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the computer terminal described above. For example, the computer terminal may also include more or fewer components than shown in fig. 1, or have a different configuration from that shown in fig. 1.
The memory 104 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the data processing methods in the embodiments of the present application, and the processor 102 executes the software programs and modules stored in the memory 104 to perform various functional applications and data processing, i.e., implement the methods described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
It should be noted that a real quantum computer is a hybrid structure comprising two major parts: one part is a classical computer responsible for classical computation and control; the other part is the quantum device responsible for running quantum programs to realize quantum computation. A quantum program is a sequence of instructions, written in a quantum language such as the QRunes language, that can run on a quantum computer; it supports quantum logic gate operations and ultimately realizes quantum computation. Specifically, a quantum program is a sequence of instructions that operates quantum logic gates in a certain time order.
In practical applications, because quantum device hardware is still in development, quantum computing simulation is often required to verify quantum algorithms, quantum applications, and the like. Quantum computing simulation is the process of simulating the running of the quantum program corresponding to a specific problem with a virtual architecture (namely, a quantum virtual machine) built from the resources of an ordinary computer. In general, the quantum program corresponding to the specific problem needs to be constructed. A quantum program is a program, written in a classical language, that represents qubits and their evolution, in which the qubits, quantum logic gates and so on involved in quantum computation are all represented by corresponding classical code.
A quantum circuit, also called a quantum logic circuit, is one embodiment of a quantum program and the most commonly used general quantum computing model. It represents a circuit that operates on qubits under an abstract concept; its composition includes qubits, the circuit (timeline), and various quantum logic gates, and the result usually needs to be read out through quantum measurement operations.
Unlike a conventional circuit, which is connected by metal wires to carry voltage or current signals, a quantum circuit can be seen as being connected by time: the state of a qubit evolves naturally over time, following the instructions of the Hamiltonian, until it is operated on when a logic gate is encountered.
One quantum program corresponds to one total quantum circuit, and the quantum program here refers to that total quantum circuit, where the total number of qubits in the total quantum circuit is the same as the total number of qubits of the quantum program. It can be understood that a quantum program may consist of a quantum circuit, measurement operations on the qubits in the quantum circuit, registers that hold the measurement results, and control-flow nodes (jump instructions), and one quantum circuit may contain tens, hundreds, or even thousands of quantum logic gate operations. The execution of a quantum program is the process of executing all of the quantum logic gates in a certain time order. Note that the time order is the order in which the individual quantum logic gates are executed.
It should be noted that in classical computation the most basic unit is the bit and the most basic control mode is the logic gate; the purpose of a control circuit is achieved by combining logic gates. Similarly, qubits are handled by quantum logic gates, which are the basis for forming quantum circuits. Quantum logic gates include single-bit quantum logic gates, such as Hadamard gates (H gates), Pauli-X gates (X gates), Pauli-Y gates (Y gates), Pauli-Z gates (Z gates), RX gates, RY gates and RZ gates, as well as multi-bit quantum logic gates such as CNOT gates, CR gates, iSWAP gates and Toffoli gates. A quantum logic gate is typically represented by a unitary matrix, which is not only a matrix form but also an operation and transformation. A quantum logic gate generally acts on a quantum state by multiplying its unitary matrix with the column vector corresponding to the right vector (ket) of the quantum state. For example, the quantum state right vector |0> corresponds to the column vector (1, 0)^T, and the quantum state right vector |1> corresponds to the column vector (0, 1)^T.
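As a minimal numerical illustration of this matrix-vector picture (using the standard gate matrices, with numpy only for the arithmetic), applying the Pauli-X and Hadamard matrices to the column vector of |0> gives:

    import numpy as np

    # Column vector for the basis state |0>
    ket0 = np.array([[1.0], [0.0]])

    # Unitary matrices of two single-qubit gates
    X = np.array([[0.0, 1.0],
                  [1.0, 0.0]])                        # Pauli-X (NOT) gate
    H = (1.0 / np.sqrt(2)) * np.array([[1.0,  1.0],
                                       [1.0, -1.0]])  # Hadamard gate

    # A gate acts on a state by multiplying its unitary matrix with the state's column vector
    print(X @ ket0)   # -> the vector of |1>, i.e. [[0.], [1.]]
    print(H @ ket0)   # -> (|0> + |1>)/sqrt(2), i.e. [[0.7071...], [0.7071...]]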
FIG. 2 is a flow chart illustrating a method of data processing according to an exemplary embodiment. Referring to fig. 2, the present embodiment provides a data processing method that can be applied to an electronic device including a machine learning framework 30 as shown in fig. 3, the machine learning framework 30 including a data structure module 31, a quantum module 32, and a classical module 33, the method including:
S21, calling the data structure module to acquire input data and creating tensor data comprising the input data, and calling the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the calculation layers.
S22, determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers.
S23, creating a calculation graph comprising the sub-calculation graph corresponding to the first calculation layer based on the forward propagation relation.
S24, determining an output result of the machine learning model based on the calculation map.
In particular, the machine learning framework 30 integrates sets of functions for creating and training machine learning models, and by calling the interfaces it defines, relevant operations on a machine learning model can be performed conveniently. As shown in fig. 3, the machine learning framework 30 may include:
a data structure module 31 configured to acquire input data and create tensor data including the input data;
a quantum module 32 configured to create a machine learning model;
a classical module 33 configured to create a machine learning model comprising a plurality of computational layers and a forward propagation relationship between the plurality of computational layers;
The classical module 33 is further configured to determine a first calculation layer to be executed, corresponding to the tensor data, from a plurality of calculation layers; creating a computation graph comprising computation nodes corresponding to the first computation layer based on the forward propagation relationship; an output result of the machine learning model is determined based on the computational graph.
Specifically, the data structure module 31 defines the data structure of tensor data. By calling the data structure module 31, input data can be converted into tensor data to be input to the machine learning model for forward computation. Of course, in other possible embodiments, the data structure module 31 may also be used to perform operations on tensor data; for example, it may further define mathematical operations and logical operations between tensor data, and the data structure module 31 may also be called to create a classical calculation layer of the machine learning model based on the operational relationship between tensor data. For example, a fully connected layer of a classical neural network defines the relationship between input data x and output data y through the function y = wx + b, where w and b are parameters; by converting the input data x, the parameter w and the parameter b into tensor data and calling the data structure module 31 to perform the operation corresponding to this function on the tensor data, a fully connected layer can be constructed.
In one possible implementation, the data structure module 31 may be configured to arrange input data according to a preset data structure, so as to create tensor data that is arranged in the preset data structure and has determined values, for input to the machine learning model. Further, in step S21, the input data may be arranged according to the preset data structure to obtain tensor data, with the input data stored as part of the tensor data. For example, if the acquired input data are 1, 2, 3, they can be converted into the vector structure [1, 2, 3] as part of the tensor data. It should be noted that the input data may be data used to train the machine learning model, or data whose class is to be predicted.
Besides the data values arranged according to the preset data structure, tensor data may further include information about the tensor data from which its data values are calculated, and the gradient functions of the tensor data relative to the tensor data from which it is calculated. The information about the tensor data used in the calculation may include the variable of that tensor data, the storage address of its data values, the data values themselves, and the like, as long as it indicates that the node corresponding to that tensor data is a precursor node of the node corresponding to the calculated tensor data. Taking the above functional relation y = wx + b as an example, the tensor data y includes the data values of y, such as [1, 2, 3], and also includes information about the tensor data w, x and b from which y is calculated, as well as the gradient functions of y with respect to w, x and b. In one possible implementation, the information may include the data value storage addresses of w, x and b, and the tensor data y includes the gradient function of y with respect to w (namely x), the gradient function of y with respect to x (namely w), and the gradient function of y with respect to b (namely 1). Further, when the machine learning model is trained, the gradient values of y with respect to w, x and b are calculated by back propagation: the data values and the corresponding gradient functions of w, x and b are obtained directly from the tensor data y, and the gradient values of y with respect to w, x and b are calculated from these data values and gradient functions.
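A minimal sketch of such a tensor data structure (the class and field names here are illustrative assumptions, not the framework's actual interface):

    from dataclasses import dataclass, field
    from typing import Callable, List, Optional

    @dataclass
    class Tensor:
        """Tensor data: data values plus bookkeeping for the computation graph."""
        data: list                                                  # data values arranged in a preset structure, e.g. [1, 2, 3]
        precursors: List["Tensor"] = field(default_factory=list)   # tensor data this tensor was calculated from
        grad_fns: List[Callable] = field(default_factory=list)     # gradient functions, one per precursor
        grad: Optional[list] = None                                 # gradient filled in during back propagation

    # y = w*x + b: y records w, x and b as precursors together with the
    # gradient functions dy/dw = x, dy/dx = w and dy/db = 1.
    def linear(w: Tensor, x: Tensor, b: Tensor) -> Tensor:
        values = [wi * xi + bi for wi, xi, bi in zip(w.data, x.data, b.data)]
        return Tensor(
            data=values,
            precursors=[w, x, b],
            grad_fns=[lambda: x.data, lambda: w.data, lambda: [1.0] * len(values)],
        )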
Specifically, as for the quantum module 32, a quantum computing layer of the machine learning model can be created by calling the quantum module 32. The quantum computing layer is a program module that contains a quantum program and can be used to realize the quantum computation of the corresponding quantum program; by packaging the quantum program according to a certain standard, the quantum computing layer is convenient to use when creating and training the machine learning model. The part of the machine learning model implemented by quantum computation can be understood as the corresponding quantum computing layer. To implement quantum computation, a quantum program may be obtained by calling the quantum module 32 to create quantum logic gates that act on the qubits in a particular order, and the quantum program is then packaged to obtain the quantum computing layer.
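As an illustration only, a quantum computing layer can be thought of as a quantum program (an ordered gate sequence) packaged behind a standard forward interface; the class below simulates the program on a state vector, the way a quantum virtual machine would (all names are hypothetical, not the framework's real API):

    import numpy as np

    class QuantumComputingLayer:
        """Hypothetical wrapper that packages a quantum program behind a standard interface."""

        def __init__(self, num_qubits: int, gates: list):
            # `gates` is the quantum program: an ordered list of (unitary_matrix, target_qubit) pairs.
            self.num_qubits = num_qubits
            self.gates = gates

        def forward(self, state: np.ndarray) -> np.ndarray:
            # Apply each gate to the state vector in order (simulation on a quantum virtual machine).
            for unitary, target in self.gates:
                full = np.array([[1.0]])
                for q in range(self.num_qubits):
                    full = np.kron(full, unitary if q == target else np.eye(2))
                state = full @ state
            return state

    # Usage: an RY rotation applied to qubit 0 of a single-qubit register initialised to |0>.
    theta = 0.3
    ry = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                   [np.sin(theta / 2),  np.cos(theta / 2)]])
    layer = QuantumComputingLayer(num_qubits=1, gates=[(ry, 0)])
    print(layer.forward(np.array([1.0, 0.0])))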
Specifically, as for the classical module 33, a classical computing layer of the machine learning model may be created by calling the classical module 33. The classical computing layer is the classical computation part of the machine learning model and may be obtained by the classical module 33 encapsulating a created classical calculation program according to a certain standard, so that the classical computing layer is convenient to use when training the machine learning model. After the quantum computing layer and the classical computing layer are created, they can be encapsulated by the classical module 33 to create an abstract class layer that meets a certain standard. The abstract class layer is implemented through the class mechanism of a programming language; by encapsulating the quantum computing layer and the classical computing layer, a machine learning model that meets a certain standard can be created. For example, the created abstract class layer defines forward running of the machine learning model, which makes it convenient to run the machine learning model forward during training to obtain the calculation result used to compute the loss function, and at the same time the sequential relationship used for gradient computation in the reverse calculation can also be obtained. The classical module 33 may also be used to train the machine learning model through a training layer that it creates for the machine learning model.
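A rough sketch of such an abstract class layer and of packaging calculation layers into a model, with hypothetical class names (the real framework's classes may differ):

    class Module:
        """Hypothetical abstract class layer: anything with a forward() method can be composed into a model."""

        def forward(self, x):
            raise NotImplementedError

        def __call__(self, x):
            return self.forward(x)

    class FullyConnected(Module):
        """Classical computing layer: y = w*x + b (scalar case for brevity)."""

        def __init__(self, w: float, b: float):
            self.w, self.b = w, b

        def forward(self, x):
            return self.w * x + self.b

    class HybridModel(Module):
        """Machine learning model that encapsulates a quantum computing layer and a classical computing layer."""

        def __init__(self, quantum_layer, classical_layer):
            self.quantum_layer = quantum_layer
            self.classical_layer = classical_layer

        def forward(self, x):
            # Forward propagation relationship: the output of the quantum layer is the input of the classical layer.
            return self.classical_layer(self.quantum_layer(x))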
In addition, the classical module 33 may be invoked to determine a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers; creating a computation graph comprising computation nodes corresponding to the first computation layer based on the forward propagation relationship; and determining an output result of the machine learning model based on the calculation map, and completing forward operation of the machine learning model, wherein the specific operation process can be seen from the following description of related steps in the data processing method.
In step S21, the quantum module 32 may be called to create a quantum computing layer, the classical module 33 may be called to create a classical computing layer, and the quantum computing layer and the classical computing layer are then packaged by the classical module 33 to obtain a machine learning model mixing quantum computation and classical computation. Alternatively, after the quantum module 32 is called to create the quantum computing layer, the classical module 33 may be used directly to package the quantum computing layer, obtaining a purely quantum machine learning model. As for the input data, the data structure module 31 is called to create tensor data containing the input data for input to the machine learning model. The created machine learning model has a plurality of calculation layers: for example, there may be several quantum computing layers, or several classical computing layers, or of course a mixture of quantum computing layers and classical computing layers. The plurality of calculation layers have a forward propagation relationship between them, which determines the data transfer relationship between the calculation layers when the machine learning model is run forward (that is, during the forward operation); for example, the output of one calculation layer is the input of another calculation layer.
In step S22, the to-be-executed calculation layer in which the tensor data is a dependent variable may be determined to be the first calculation layer, according to the calculation relationships of the plurality of calculation layers in the machine learning model. For example, suppose the machine learning model has two calculation layers, w = c×d and y = w×x, and the tensor data corresponding to the input data is x. Since x is a dependent variable in the calculation layer y = w×x, that calculation layer may be regarded as the first calculation layer to be executed.
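A small sketch of how this selection could be expressed in code (Layer, dependent_vars and find_first_layer are illustrative names, not the framework's actual interface):

    from dataclasses import dataclass

    @dataclass
    class Layer:
        name: str               # e.g. "w = c*d" or "y = w*x"
        dependent_vars: list    # names of the variables this layer's output depends on
        executed: bool = False

    def find_first_layer(layers, input_var: str):
        """Return the to-be-executed calculation layer whose dependent variables include the input tensor."""
        for layer in layers:
            if not layer.executed and input_var in layer.dependent_vars:
                return layer
        return None

    layers = [Layer("w = c*d", ["c", "d"]), Layer("y = w*x", ["w", "x"])]
    print(find_first_layer(layers, "x").name)   # -> "y = w*x"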
After the first calculation layer is determined, step S23 is performed to create a new calculation map, where the calculation map may include sub-calculation maps corresponding to the first calculation layer.
Optionally, in step S23, referring to fig. 4, creating a computation graph including a sub computation graph corresponding to the first computation layer based on the forward propagation relationship includes:
S231, determining, based on the forward propagation relationship, whether the first calculation layer is preceded by an unexecuted second calculation layer associated with the first calculation layer.
S232, if there is an unexecuted second calculation layer associated with the first calculation layer, executing the second calculation layer, and determining a calculation relation between the output of the second calculation layer and the output of the first calculation layer.
S233, adding the sub-calculation graph corresponding to the first calculation layer to the calculation graph corresponding to the second calculation layer based on the calculation relation, and obtaining a new calculation graph.
In step S231, the output of the second calculation layer may be the input of the first calculation layer, i.e. the dependent variable, so the first calculation layer can be executed after the second calculation layer is executed, and thus it is determined whether there is a second calculation layer that is not executed before the first calculation layer according to the forward propagation relationship.
In step S232, if there is an unexecuted second calculation layer that has the aforementioned association with the first calculation layer, for example the output of the second calculation layer is the input of the first calculation layer, the second calculation layer is executed. Specifically, a sub-computation graph of the second calculation layer may be created first, the sub-computation graph is then added to the computation graph corresponding to the already executed calculation layers, and the second calculation layer is then executed based on that computation graph to obtain the output of the second calculation layer. In addition, the calculation relationship between this output and the output of the first calculation layer is determined, for example that this output is a dependent variable of the output of the first calculation layer.
In step S233, a sub-computation graph corresponding to the first computation layer may be created, and then the sub-computation graph is added to the computation graph corresponding to the second computation layer, so as to obtain a new computation graph.
Optionally, in step S233, referring to fig. 5, adding the sub-computation graph corresponding to the first computation layer to the computation graph corresponding to the second computation layer based on the computation relationship, to obtain a new computation graph, including:
and S2331, adding the output corresponding computing node of the first computing layer as a subsequent node of the output corresponding computing node of the second computing layer into the computing graph corresponding to the second computing layer based on the computing relation.
And S2332, adding the dependent variable corresponding computing node of the first computing layer as a precursor node of the output corresponding computing node of the first computing layer to the computing graph corresponding to the second computing layer to obtain a new computing graph.
In step S2331, the output of the first computation layer is obtained according to the output of the second computation layer, so in the computation graph corresponding to the second computation layer, the computation node corresponding to the output of the first computation layer is added to the computation graph corresponding to the second computation layer as the subsequent node of the computation node corresponding to the output of the second computation layer, and the specific implementation thereof may be referred to a graph (graph) structure of the data structure, for example, the relationship may be represented according to a linked list.
In step S2332, the output of the first computation layer is also obtained from the dependent variables other than the output of the second computation layer, so that the dependent variable corresponding computation node of the first computation layer may be added to the computation graph as a precursor node of the output corresponding computation node of the first computation layer in the computation graph to obtain a new computation graph.
For example, referring to fig. 6, the machine learning model includes a plurality of calculation layers, of which the first two are w = c×d and y = w×x, where x is tensor data. Since the tensor data x is located in the calculation layer y = w×x, that calculation layer is the first calculation layer, and the second calculation layer w = c×d lies before it. When the computation graph is created, the sub-computation graph 61 corresponding to the second calculation layer is created first and used as a new computation graph. When creating the sub-computation graph 61, since c and d are dependent variables of w, the computation node 611 corresponding to c and the computation node 613 corresponding to d are used as precursor nodes of the computation node 612 corresponding to w. After the sub-computation graph 61 is created, the second calculation layer is executed according to the sub-computation graph 61, and execution then proceeds to the first calculation layer. The computation node 614 corresponding to the output y of the first calculation layer is added, as a subsequent node of the computation node 612 corresponding to the output w of the second calculation layer, to the computation graph corresponding to the second calculation layer, namely the sub-computation graph 61. Then, for the computation node 615 corresponding to x, since x is a dependent variable of y, the computation node 615 is added as a precursor node of the computation node 614 to the computation graph corresponding to the second calculation layer, giving a new computation graph composed of computation node 611, computation node 612, computation node 613, computation node 614 and computation node 615. The corresponding calculation layer may then be executed according to the new computation graph to obtain the output of that calculation layer.
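A simplified sketch of this layer-by-layer graph construction and immediate execution (Node, Graph and add_layer are assumed names for illustration, not the framework's API):

    class Node:
        """A computation node: knows its precursor nodes and how to compute its value from them."""

        def __init__(self, name, precursors=None, op=None, value=None):
            self.name = name
            self.precursors = precursors or []   # precursor (predecessor) nodes
            self.op = op                         # forward formula, e.g. lambda c, d: c * d
            self.value = value                   # given for leaf nodes, computed for the others

    class Graph:
        def __init__(self):
            self.nodes = []

        def add_layer(self, output_name, precursors, op):
            """Add one calculation layer's sub-computation graph: the output node becomes a
            subsequent node of its precursor nodes, and the layer is executed immediately."""
            node = Node(output_name, precursors, op)
            self.nodes.append(node)
            node.value = op(*[p.value for p in precursors])   # execute right after extending the graph
            return node

    # The layers w = c*d and y = w*x from the example around fig. 6.
    g = Graph()
    c, d, x = Node("c", value=2.0), Node("d", value=3.0), Node("x", value=4.0)
    g.nodes += [c, d, x]
    w = g.add_layer("w", [c, d], lambda a, b: a * b)   # second calculation layer, executed first
    y = g.add_layer("y", [w, x], lambda a, b: a * b)   # first calculation layer, executed next
    print(y.value)                                     # -> 24.0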
Optionally, in step S24, referring to fig. 7, determining an output result of the machine learning model based on the calculation map includes:
S241, executing the first calculation layer based on the calculation map to obtain the output of the first calculation layer.
S242, determining an output result of the machine learning model based on the output of the first calculation layer.
In step S241, the computation nodes of the computation graph may contain the formula of the forward operation. For example, for computation node 614 in fig. 6, the calculation formula y = w×x may be stored in computation node 614; specifically, a list corresponding to computation node 614 may be created and the calculation formula stored in that list. The output of the first calculation layer is then calculated according to the precursor nodes of computation node 614 and the calculation formula in computation node 614. It should be noted that, referring to fig. 6, for computation node 616, which performs quantum computation, the quantum program corresponding to the quantum circuit 6161 is stored in computation node 616, and the effect of the quantum circuit 6161 on the qubits may be equivalent to that of the unitary matrix U(x; θ).
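For the quantum computation node, a minimal sketch of its forward operation (an RY rotation stands in for the parameterized unitary U(x; θ), and reading out a Pauli-Z expectation value is an assumed readout choice; the function names are illustrative):

    import numpy as np

    def ry(angle: float) -> np.ndarray:
        """RY rotation matrix, used here as a stand-in for the parameterized unitary U(x; theta)."""
        return np.array([[np.cos(angle / 2), -np.sin(angle / 2)],
                         [np.sin(angle / 2),  np.cos(angle / 2)]])

    def quantum_node_forward(x: float, theta: float) -> float:
        """Forward operation of a quantum computation node: apply U(x; theta) to |0>,
        then read out the expectation value of Pauli-Z as the node's output."""
        state = ry(x * theta) @ np.array([1.0, 0.0])   # U(x; theta)|0>
        z = np.array([[1.0, 0.0], [0.0, -1.0]])        # Pauli-Z observable
        return float(state @ z @ state)                # <psi|Z|psi>

    print(quantum_node_forward(x=0.5, theta=1.2))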
In step S242, the sub-computation graphs of the subsequent computation layers may be added to the computation graphs of the executed computation layers according to the output of the first computation layer, and the corresponding computation layers may be executed according to the obtained new computation graph until all computation layers are executed, so as to obtain the output result of the machine learning model, for example, the computation result of the last computation layer may be the output result of the machine learning model.
Based on this technical scheme, for a machine learning model created by calling the machine learning framework, a first calculation layer to be executed is first determined among the plurality of calculation layers included in the machine learning model, a computation graph including the sub-computation graph corresponding to the first calculation layer is then created, and the output result of the machine learning model is determined from that computation graph. In other words, each calculation layer is executed immediately after its computation graph is created, rather than only after the computation graphs of all calculation layers have been created. When the machine learning model is debugged, it can therefore be run layer by layer and debugged according to the layer-by-layer running results, which makes it easier to locate problems in the model, reduces the difficulty of debugging the machine learning model, and speeds up debugging.
Fig. 8 is another flowchart illustrating a data processing method according to an exemplary embodiment, and referring to fig. 8, the method may be applied to an electronic device including a machine learning framework 30 as shown in fig. 3, the machine learning framework 30 including a data structure module 31, a quantum module 32, and a classical module 33, the method including:
S81, calling the data structure module to acquire input data and creating tensor data comprising the input data, and calling the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the calculation layers.
S82, determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers.
S83, determining, based on the forward propagation relationship, whether the first calculation layer is preceded by an unexecuted second calculation layer associated with the first calculation layer.
S84, if an unexecuted second calculation layer associated with the first calculation layer exists, executing the second calculation layer, and determining a calculation relation between the output of the second calculation layer and the output of the first calculation layer.
And S85, adding the sub-calculation graphs corresponding to the first calculation layer to the calculation graphs corresponding to the second calculation layer based on the calculation relation, and obtaining a new calculation graph.
S86, if there is no unexecuted second calculation layer associated with the first calculation layer, creating the calculation graph corresponding to the first calculation layer.
S87, determining an output result of the machine learning model based on the calculation map.
Wherein, step S81 and step S82 may be referred to as step S21 and step S22, respectively, step S83 to step S85 may be referred to as step S231 to step S233, respectively, and step S87 may be referred to as step S24.
In step S83, if it is determined that there is no second computing layer associated with the first computing layers before the first computing layer, the process proceeds to step S86, where a computing graph corresponding to the first computing layer is directly created, specifically, a corresponding computing graph may be created by using an output corresponding computing node of the first computing layer as a successor node of a dependent variable corresponding computing node of the first computing layer, and an output result of the machine learning model may be determined according to the created computing graph.
Fig. 9 is another flowchart of a data processing method according to an exemplary embodiment, and referring to fig. 9, the method may be applied to an electronic device including a machine learning framework 30 as shown in fig. 3, the machine learning framework 30 including a data structure module 31, a quantum module 32, and a classical module 33, the method including:
S91, calling the data structure module to acquire input data and creating tensor data comprising the input data, and calling the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the calculation layers.
S92, determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers.
S93, creating a calculation graph comprising the sub-calculation graph corresponding to the first calculation layer based on the forward propagation relation.
And S94, determining an output result of the machine learning model based on the calculation map.
S95, calling the classical module to create a training layer of the machine learning model.
S96, inputting the output result of the machine learning model into the training layer to add the corresponding sub-calculation graph of the training layer to the calculation graph based on the relation between the training layer and the machine learning model.
S97, updating parameters of the machine learning model based on the calculation map to obtain the trained machine learning model.
Wherein, steps S91 to S94 can be referred to as steps S21 to S24, respectively.
After obtaining the output result of the machine learning model, step S95 may be executed to create a training layer for training the machine learning model. Of course, the training layer may also be created at the time of creating the machine learning model, which is not particularly limited by the present invention.
Optionally, referring to fig. 10, the training layer includes a loss function layer and an optimizer layer, and the classical module 33 includes:
a loss function unit 331 configured to calculate a loss function of the machine learning model;
an optimizer unit 332 is configured to update parameters of the machine learning model based on the loss function when training the machine learning model to optimize the machine learning model.
Optionally, referring to fig. 11, in step S95, invoking the classical module to create a training layer of the machine learning model includes:
S951, calling the loss function unit to create the loss function layer.
S952, calling the optimizer unit to create the optimizer layer.
Specifically, the loss function unit 331 is configured to calculate the loss function of the machine learning model; for example, it may calculate the square of the difference between the output result of the machine learning model and the label data as the loss function, or it may calculate the binary cross entropy (Binary Cross Entropy) between the output result and the label data as the loss function. The optimizer unit 332 may then be configured to update the parameters of the machine learning model with a gradient descent algorithm, so as to optimize the loss function according to its gradient relative to the parameters of the machine learning model. For example, the gradient descent algorithm adopted by the optimizer may be any one of stochastic gradient descent (Stochastic Gradient Descent, SGD), the adaptive gradient algorithm (Adaptive Gradient Algorithm, AdaGrad) and adaptive moment estimation (Adaptive Moment Estimation, Adam); of course, other algorithms may also be adopted to update the parameters of the machine learning model. The invention does not particularly limit which types of loss function the loss function unit 331 can calculate or which methods the optimizer unit 332 adopts to update the parameters.
In order to realize the training of the machine learning model, step S951 may be executed: the loss function unit 331 is called to create the loss function layer. The loss function layer is an encapsulated calculation module that defines the calculation mode of the loss function; when the prediction result of the machine learning model is input to the loss function layer, the loss function of the machine learning model can be calculated according to the calculation mode defined by the loss function layer. After the loss function layer is created, step S952 may be executed: the optimizer unit 332 is called to create the optimizer layer, so that after the prediction result is input to the loss function layer and the loss function is calculated, the parameters of the machine learning model are updated according to the loss function until suitable parameters are obtained, the machine learning model achieves the expected effect, and the optimization of the machine learning model is completed.
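A minimal sketch of the idea of such "packaged" layers follows, assuming hypothetical class names (MSELossLayer, SGDOptimizerLayer); creating a layer in step S951 or S952 then amounts to instantiating an object that encapsulates its calculation mode.

```python
# Sketch of packaged loss-function and optimizer layers (hypothetical class names).
import numpy as np

class MSELossLayer:
    """Loss function layer: defines how the loss value is calculated."""
    def __call__(self, pred, label):
        return np.mean((pred - label) ** 2)

class SGDOptimizerLayer:
    """Optimizer layer: defines how parameters are updated from gradients."""
    def __init__(self, lr=0.05):
        self.lr = lr
    def step(self, params, grads):
        return [p - self.lr * g for p, g in zip(params, grads)]

# S951 / S952: creating the layers amounts to instantiating these modules.
loss_layer = MSELossLayer()
optimizer_layer = SGDOptimizerLayer(lr=0.05)
```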
In step S96, inputting the output result into the training layer indicates that the training process of the machine learning model has started. At this time, using the foregoing calculation method, the sub-calculation graph corresponding to the training layer may be added, according to the relationship between the training layer and the machine learning model, to the calculation graph corresponding to the executed calculation layers. Then, in step S97, the parameters of the machine learning model are updated according to the calculation graph to obtain the trained machine learning model.
Optionally, in step S96, inputting the output result of the machine learning model into the training layer to add the training layer corresponding sub-computation graph to the computation graph based on the relationship between the training layer and the machine learning model, including:
and inputting the output result of the machine learning model into the loss function layer to calculate the value of the loss function of the machine learning model, and adding the calculation node corresponding to the value of the loss function as a subsequent node of the calculation node corresponding to the output result of the machine learning model into the calculation graph.
Since the value of the loss function is calculated from the output result of the machine learning model, the calculation node corresponding to the value of the loss function, that is, the output of the loss function layer, can be added to the calculation graph corresponding to the executed calculation layers as a subsequent node of the calculation node corresponding to the output of the machine learning model. Referring to fig. 6, if the output result of the machine learning model corresponds to calculation node 617, the calculation node 618 corresponding to the value of the loss function Loss may be added to the calculation graph as a subsequent node of calculation node 617.
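The following sketch illustrates this with hypothetical Node and Graph classes: the node of the loss value is appended as a successor of the node of the model output, in the spirit of nodes 617 and 618 in fig. 6.

```python
# Hypothetical Node/Graph classes: appending the loss-value node as a successor
# of the model-output node (cf. nodes 617 and 618).
class Node:
    def __init__(self, name, predecessors=()):
        self.name = name
        self.predecessors = list(predecessors)   # edges to the nodes it is computed from

class Graph:
    def __init__(self):
        self.nodes = []
    def add(self, node):
        self.nodes.append(node)
        return node

graph = Graph()
output_node = graph.add(Node("model_output"))                           # e.g. node 617
loss_node = graph.add(Node("loss_value", predecessors=[output_node]))   # e.g. node 618
```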
Optionally, in step S97, referring to fig. 12, updating parameters of the machine learning model based on the calculation graph, to obtain the trained machine learning model, including:
S971, when it is determined that the value of the loss function does not meet a preset condition, inputting the value of the loss function into the optimizer layer to update parameters of the machine learning model based on the value of the loss function and the calculation graph.
S972, determining a value of the loss function of the machine learning model after updating the parameters.
S973, when the value of the loss function meets the preset condition, the machine learning model after updating the parameters is used as the machine learning model after training.
In step S971, whether the value of the loss function satisfies the preset condition may be determined by comparing the value of the loss function with a preset threshold; for example, when the value of the loss function is determined to be greater than or equal to the threshold, the value of the loss function is input into the optimizer layer. Of course, other methods may also be used to determine that the value of the loss function does not satisfy the preset condition, as long as it can be determined from the value of the loss function that the current machine learning model does not meet expectations. When the preset condition is not satisfied, the value of the loss function is input into the optimizer layer; the gradient of the loss function relative to the parameters of the machine learning model can then be calculated based on the chain derivative rule, using the value of the loss function and the relationships between the calculation nodes corresponding to each piece of data in the calculation graph, and the parameters of the machine learning model are updated based on the gradient descent algorithm.
In step S972, after the parameters of the machine learning model are updated, the value of the corresponding loss function is recalculated, and whether the value of the loss function satisfies the preset condition is judged again. If not, the process may return to step S971 to continue updating the parameters of the machine learning model according to the value of the loss function; if so, the process may proceed to step S973.
In step S973, when it is determined that the value of the loss function satisfies the preset condition, for example when the value of the loss function is smaller than the threshold, the difference between the output result of the machine learning model and the label data is small and the machine learning model can achieve the expected application effect; the machine learning model with the updated parameters is therefore taken as the trained machine learning model, and the updating of the parameters is stopped.
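The loop structure of steps S971 to S973 can be summarized by the following sketch, in which the preset condition is assumed to be "the loss value is below a threshold" and forward, loss_fn and grad_fn are hypothetical placeholders for the calculation-graph machinery described above.

```python
# Illustrative loop for steps S971-S973 (hypothetical helper functions).
import numpy as np

def train(params, forward, loss_fn, grad_fn, lr=0.1, threshold=1e-3, max_iter=1000):
    for _ in range(max_iter):
        loss = loss_fn(forward(params))
        if loss < threshold:          # preset condition satisfied: stop (S973)
            break
        grads = grad_fn(params)       # S971: gradient obtained via the calculation graph
        params = [p - lr * g for p, g in zip(params, grads)]
        # S972: the loss value is recomputed at the top of the next iteration
    return params

# Example: minimise (p - 3)^2, whose gradient with respect to p is 2 * (p - 3).
trained = train(params=[np.array(0.0)],
                forward=lambda ps: ps[0],
                loss_fn=lambda out: float((out - 3.0) ** 2),
                grad_fn=lambda ps: [2.0 * (ps[0] - 3.0)],
                lr=0.1)
print(trained)   # approaches [3.0]
```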
Optionally, referring to fig. 13, in step S971, updating parameters of the machine learning model based on the value of the loss function and the calculation graph includes:
S9711, calculating the gradient of the loss function relative to the parameters of the machine learning model based on the value of the loss function and the calculation graph.
S9712, updating parameters of the machine learning model based on the gradient and gradient descent algorithm.
In step S9711, the gradient of the loss function with respect to the parameters may be obtained, for example, by calculating the partial derivatives of the loss function with respect to those parameters. In step S9712, the obtained gradient is substituted into the update formula of the gradient descent algorithm to update the parameters of the machine learning model. The gradient reflects the direction in which the loss function changes fastest, so the gradient descent algorithm can change the parameters rapidly, speeding up the change of the value of the loss function; in this way, parameters whose corresponding loss function value satisfies the preset condition can be found quickly, and a machine learning model meeting the requirements is obtained.
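The following worked example (illustrative only) shows, for the simple loss L(w) = (x·w − y)², that the analytic partial derivative agrees with a numerical estimate, and how one gradient descent update step is applied.

```python
# Worked example: analytic vs numerical partial derivative, plus one update step.
x, y, w = 2.0, 5.0, 1.0

def loss(w):
    return (x * w - y) ** 2

analytic_grad = 2.0 * (x * w - y) * x                       # dL/dw by the chain rule
eps = 1e-6
numeric_grad = (loss(w + eps) - loss(w - eps)) / (2 * eps)  # central difference check

eta = 0.05                                                  # learning rate
w_new = w - eta * analytic_grad                             # gradient descent update
print(analytic_grad, numeric_grad, w_new)                   # about -12.0, -12.0, 1.6
```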
Optionally, referring to fig. 14, in step S9711, calculating a gradient of the loss function with respect to a parameter of the machine learning model based on the value of the loss function and the calculation graph includes:
S97111, determining paths from the loss function corresponding computing nodes to the parameter corresponding computing nodes of the machine learning model in the computing graph.
S97112, calculating an intermediate gradient of each calculation node of the non-leaf nodes on the path with respect to a predecessor node of the calculation node based on the value of the loss function.
S97113, multiplying all the calculated intermediate gradients to obtain the gradient of the loss function with respect to the parameter.
In step S97111, the shortest path between the two may be determined in the calculation graph, with the calculation node corresponding to the loss function as the start point and the calculation node corresponding to the selected parameter as the end point. Further, in step S97112, for each non-leaf calculation node on the path, the intermediate gradient of that calculation node with respect to its predecessor node is calculated. Since a leaf node has no predecessor node, no corresponding intermediate gradient can be calculated for it; the leaf node is typically a parameter and, as the end point of the path, does not need its gradient calculated.
After the intermediate gradients are calculated, step S97113 is executed to multiply all the intermediate gradients corresponding to the path; according to the chain derivative rule, the product is the gradient of the loss function relative to the parameter.
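A minimal sketch of steps S97111 to S97113 follows, assuming a hypothetical node structure: the path from the loss node back to the parameter node is walked, and the intermediate (local) gradients of the non-leaf nodes are multiplied, giving the gradient of the loss with respect to the parameter by the chain derivative rule.

```python
# Sketch of steps S97111-S97113 with a hypothetical node structure.
# Example graph: w (leaf parameter) -> z = 3*w -> loss = z**2, evaluated at w = 2.
class PathNode:
    def __init__(self, name, predecessor=None, local_grad=None):
        self.name = name
        self.predecessor = predecessor   # the node this one is computed from
        self.local_grad = local_grad     # d(this node) / d(predecessor)

w_val = 2.0
w = PathNode("w")                                                  # leaf node: the parameter
z = PathNode("z", predecessor=w, local_grad=3.0)                   # z = 3*w,  dz/dw = 3
loss = PathNode("loss", predecessor=z, local_grad=2 * 3 * w_val)   # dloss/dz = 2*z = 12

def gradient_along_path(loss_node, param_node):
    grad, node = 1.0, loss_node
    while node is not param_node:        # the leaf (parameter) node ends the path
        grad *= node.local_grad          # multiply the intermediate gradients
        node = node.predecessor
    return grad

print(gradient_along_path(loss, w))      # 36.0 = d((3*w)**2)/dw at w = 2
```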
Fig. 15 is a block diagram of a data processing apparatus according to an exemplary embodiment. The apparatus can be applied to an electronic device including the machine learning framework 30 shown in fig. 3, the machine learning framework 30 including a data structure module 31, a quantum module 32 and a classical module 33. The apparatus 150 includes:
a first creating module 151, configured to invoke the data structure module to obtain input data and create tensor data including the input data, and invoke the quantum module and the classical module to create a machine learning model, where the machine learning model includes a plurality of computing layers and a forward propagation relationship between the computing layers;
A determining module 152, configured to determine a first computing layer to be executed corresponding to the tensor data from a plurality of computing layers;
a second creating module 153, configured to create a computation graph including computation nodes corresponding to the first computation layer based on the forward propagation relationship;
an output module 154 for determining an output result of the machine learning model based on the computational graph.
Optionally, the second creating module 153 is further configured to:
determining whether the first computational layer is preceded by an unexecuted second computational layer associated with the first computational layer based on the forward propagation relationship;
executing a second computational layer associated with the first computational layer when there is an unexecuted second computational layer, and determining a computational relationship between an output of the second computational layer and an output of the first computational layer;
and adding the sub-calculation graph corresponding to the first calculation layer to the calculation graph corresponding to the second calculation layer based on the calculation relation to obtain a new calculation graph.
Optionally, the apparatus 150 further includes:
and a third creating module, configured to create the computation graph corresponding to the first computing layer when there is no unexecuted second computing layer associated with the first computing layer.
Optionally, the second creating module 153 is further configured to:
based on the calculation relation, taking the calculation node corresponding to the output of the first calculation layer as a subsequent node of the calculation node corresponding to the output of the second calculation layer, and adding that node into the calculation graph corresponding to the second calculation layer;
and adding the calculation nodes corresponding to the dependent variables of the first calculation layer, as predecessor nodes of the calculation node corresponding to the output of the first calculation layer, into the calculation graph corresponding to the second calculation layer to obtain a new calculation graph, as illustrated by the sketch below.
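An illustrative sketch of this merging, with hypothetical Node and Graph classes; it only shows how the output node of the first calculation layer is attached as a successor of the output node of the second calculation layer, with the nodes of the first layer's dependent variables added as its further predecessors.

```python
# Hypothetical Node/Graph classes: splicing the sub-graph of the first calculation
# layer onto the calculation graph of the executed second calculation layer.
class Node:
    def __init__(self, name, predecessors=()):
        self.name = name
        self.predecessors = list(predecessors)

class Graph:
    def __init__(self, nodes=()):
        self.nodes = list(nodes)

# Calculation graph already created for the executed second calculation layer.
second_out = Node("second_layer_output")
graph = Graph([second_out])

# The output node of the first layer becomes a successor of the second layer's
# output node, and the nodes of the first layer's dependent variables are added
# as further predecessors of that output node.
weight_node = Node("first_layer_weight")
first_out = Node("first_layer_output", predecessors=[second_out, weight_node])
graph.nodes.extend([weight_node, first_out])      # the new, merged calculation graph
```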
Optionally, the output module 154 is further configured to:
executing the first calculation layer based on the calculation graph to obtain the output of the first calculation layer;
an output result of the machine learning model is determined based on the output of the first computational layer.
Optionally, the apparatus 150 further includes:
a fourth creation module for calling the classical module to create a training layer of the machine learning model;
the input module is used for inputting the output result of the machine learning model into the training layer so as to add a corresponding sub-calculation graph of the training layer into the calculation graph based on the relation between the training layer and the machine learning model;
And the updating module is used for updating the parameters of the machine learning model based on the calculation graph to obtain the trained machine learning model.
Optionally, as shown in fig. 3, the training layer includes a loss function layer and an optimizer layer, and the classical module 33 includes:
a loss function unit 331 configured to calculate a loss function of the machine learning model;
an optimizer unit 332 configured to update parameters of the machine learning model based on the loss function when training the machine learning model to optimize the machine learning model;
the fourth creation module is further configured to:
calling the loss function unit to create the loss function layer;
and calling the optimizer unit to create the optimizer layer.
Optionally, the input module is further configured to:
inputting the output result of the machine learning model into the loss function layer to calculate the value of the loss function of the machine learning model, and adding the calculation node corresponding to the value of the loss function as the subsequent node of the calculation node corresponding to the output result of the machine learning model into the calculation graph;
the update module is further configured to:
Inputting the value of the loss function into the optimizer layer to update parameters of the machine learning model based on the value of the loss function and the computational graph when it is determined that the value of the loss function does not meet a preset condition;
determining a value of the loss function of the machine learning model after updating the parameter;
and when the value of the loss function meets the preset condition, the machine learning model after updating the parameters is used as the machine learning model after training.
Optionally, the updating module is further configured to:
calculating a gradient of the loss function relative to parameters of the machine learning model based on the value of the loss function and the computational graph;
updating parameters of the machine learning model based on the gradient and gradient descent algorithm.
Optionally, the updating module is further configured to:
determining paths from the loss function corresponding computing nodes to the parameter corresponding computing nodes of the machine learning model in the computing graph;
calculating an intermediate gradient of each calculation node of the non-leaf nodes on the path relative to a predecessor node of the calculation node based on the value of the loss function;
multiplying all the calculated intermediate gradients to obtain the gradient of the loss function relative to the parameter.
The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in the embodiments of the method, and will not be repeated here.
Still another embodiment of the present invention provides a storage medium having a computer program stored therein, wherein the computer program is configured to perform the steps in the data processing method embodiments described above when run.
Specifically, in the present embodiment, the storage medium may include, but is not limited to: a USB flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
Still another embodiment of the present invention provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of the data processing method embodiments described above.
Specifically, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Specifically, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
invoking the data structure module to acquire input data and create tensor data comprising the input data, and invoking the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the plurality of calculation layers;
determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers;
creating a computation graph comprising a sub-computation graph corresponding to the first computation layer based on the forward propagation relationship;
an output result of the machine learning model is determined based on the computational graph.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (14)

1. A data processing method for application to an electronic device comprising a machine learning framework including a data structure module, a quantum module, and a classical module, the method comprising:
invoking the data structure module to acquire input data and create tensor data comprising the input data, and invoking the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the plurality of calculation layers;
determining a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers;
creating a computation graph comprising a sub-computation graph corresponding to the first computation layer based on the forward propagation relationship;
an output result of the machine learning model is determined based on the computational graph.
2. The method of claim 1, wherein the creating a computation graph comprising a sub-computation graph corresponding to the first computation layer based on the forward propagation relationship comprises:
determining whether the first computational layer is preceded by an unexecuted second computational layer associated with the first computational layer based on the forward propagation relationship;
If there is an unexecuted second computational layer associated with the first computational layer, executing the second computational layer, and determining a computational relationship between an output of the second computational layer and an output of the first computational layer;
and adding the sub-calculation graph corresponding to the first calculation layer to the calculation graph corresponding to the second calculation layer based on the calculation relation to obtain a new calculation graph.
3. The method of claim 2, wherein the method further comprises:
if there is no unexecuted second computational layer associated with the first computational layer, creating the computation graph corresponding to the first computational layer.
4. The method of claim 2, wherein adding the sub-computation graph corresponding to the first computation layer to the computation graph corresponding to the second computation layer based on the computation relationship, to obtain a new computation graph, comprises:
based on the calculation relation, taking the calculation node corresponding to the output of the first calculation layer as a subsequent node of the calculation node corresponding to the output of the second calculation layer, and adding that node into the calculation graph corresponding to the second calculation layer;
and adding the calculation nodes corresponding to the dependent variables of the first calculation layer, as predecessor nodes of the calculation node corresponding to the output of the first calculation layer, into the calculation graph corresponding to the second calculation layer to obtain a new calculation graph.
5. The method of claim 1, wherein the determining the output result of the machine learning model based on the computational graph comprises:
executing the first calculation layer based on the computational graph to obtain the output of the first calculation layer;
an output result of the machine learning model is determined based on the output of the first computational layer.
6. The method of claim 1, wherein the method further comprises:
invoking the classical module to create a training layer of the machine learning model;
inputting the output result of the machine learning model into the training layer to add the corresponding sub-calculation graph of the training layer into the calculation graph based on the relation between the training layer and the machine learning model;
and updating parameters of the machine learning model based on the calculation graph to obtain the trained machine learning model.
7. The method of claim 6, wherein the training layer comprises a loss function layer and an optimizer layer, and wherein the classical module comprises:
a loss function unit configured to calculate a loss function of the machine learning model;
an optimizer unit configured to update parameters of the machine learning model based on the loss function when training the machine learning model to optimize the machine learning model;
The invoking the classical module to create a training layer of the machine learning model includes:
calling the loss function unit to create the loss function layer;
and calling the optimizer unit to create the optimizer layer.
8. The method of claim 7, wherein inputting the output of the machine learning model into the training layer to add the training layer corresponding sub-computational graph to the computational graph based on a relationship of the training layer to the machine learning model comprises:
inputting the output result of the machine learning model into the loss function layer to calculate the value of the loss function of the machine learning model, and adding the calculation node corresponding to the value of the loss function as the subsequent node of the calculation node corresponding to the output result of the machine learning model into the calculation graph;
the updating the parameters of the machine learning model based on the calculation map to obtain the trained machine learning model comprises the following steps:
inputting the value of the loss function into the optimizer layer to update parameters of the machine learning model based on the value of the loss function and the computational graph when it is determined that the value of the loss function does not meet a preset condition;
Determining a value of the loss function of the machine learning model after updating the parameter;
and when the value of the loss function meets the preset condition, the machine learning model after updating the parameters is used as the machine learning model after training.
9. The method of claim 8, wherein the updating parameters of the machine learning model based on the values of the loss function and the computational graph comprises:
calculating a gradient of the loss function relative to parameters of the machine learning model based on the value of the loss function and the computational graph;
updating parameters of the machine learning model based on the gradient and gradient descent algorithm.
10. The method of claim 9, wherein the calculating a gradient of the loss function relative to parameters of the machine learning model based on the values of the loss function and the computational graph comprises:
determining paths from the loss function corresponding computing nodes to the parameter corresponding computing nodes of the machine learning model in the computing graph;
calculating an intermediate gradient of each calculation node of the non-leaf nodes on the path relative to a predecessor node of the calculation node based on the value of the loss function;
Multiplying all the calculated intermediate gradients to obtain the gradient of the loss function relative to the parameter.
11. A data processing apparatus for application to an electronic device comprising a machine learning framework including a data structure module, a quantum module, and a classical module, the apparatus comprising:
the first creating module is used for calling the data structure module to acquire input data and create tensor data comprising the input data, calling the quantum module and the classical module to create a machine learning model, wherein the machine learning model comprises a plurality of calculation layers and forward propagation relations among the calculation layers;
the determining module is used for determining a first computing layer to be executed corresponding to the tensor data from a plurality of computing layers;
a second creating module, configured to create a computation graph including computation nodes corresponding to the first computation layer based on the forward propagation relationship;
and the output module is used for determining an output result of the machine learning model based on the calculation graph.
12. A machine learning framework, the framework comprising:
a data structure module configured to obtain input data and create tensor data comprising the input data;
A quantum module configured to create a machine learning model;
a classical module configured to create a machine learning model comprising a plurality of computational layers and a forward propagation relationship between the plurality of computational layers;
the classical module is further configured to determine a first calculation layer to be executed corresponding to the tensor data from a plurality of calculation layers; creating a computation graph comprising computation nodes corresponding to the first computation layer based on the forward propagation relationship; an output result of the machine learning model is determined based on the computational graph.
13. A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of claims 1 to 10 when run.
14. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 10.
CN202210083468.6A 2021-12-30 2022-01-24 Data processing method, machine learning framework and related equipment Pending CN116523059A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210083468.6A CN116523059A (en) 2022-01-24 2022-01-24 Data processing method, machine learning framework and related equipment
PCT/CN2022/143598 WO2023125858A1 (en) 2021-12-30 2022-12-29 Data processing method, machine learning framework system and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210083468.6A CN116523059A (en) 2022-01-24 2022-01-24 Data processing method, machine learning framework and related equipment

Publications (1)

Publication Number Publication Date
CN116523059A true CN116523059A (en) 2023-08-01

Family

ID=87399903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210083468.6A Pending CN116523059A (en) 2021-12-30 2022-01-24 Data processing method, machine learning framework and related equipment

Country Status (1)

Country Link
CN (1) CN116523059A (en)

Similar Documents

Publication Publication Date Title
CN114358317B (en) Data classification method based on machine learning framework and related equipment
US20240160977A1 (en) Quantum circuit compilation method, device, compilation framework and quantum operating system
WO2023125858A1 (en) Data processing method, machine learning framework system and related device
CN113128015B (en) Method and system for predicting resources required by single-amplitude analog quantum computation
CN114372539B (en) Machine learning framework-based classification method and related equipment
CN114819163B (en) Training method and device for quantum generation countermeasure network, medium and electronic device
CN115809707B (en) Quantum comparison operation method, device, electronic device and basic arithmetic component
CN117709415A (en) Quantum neural network model optimization method and device
CN116523059A (en) Data processing method, machine learning framework and related equipment
CN116432691A (en) Model training method based on machine learning framework and related equipment
CN115775029B (en) Quantum circuit conversion method, quantum circuit conversion device, quantum circuit conversion medium and electronic device
CN115775028B (en) Quantum circuit optimization method, quantum circuit optimization device, quantum circuit optimization medium and electronic device
CN114372583B (en) Quantum program optimization method based on machine learning framework and related equipment
CN116542337A (en) Data processing method, machine learning framework and related equipment
CN114912619B (en) Quantum computing task scheduling method and device and quantum computer operating system
CN115271079B (en) Quantum circuit replacement method, device, medium and quantum computer operating system
WO2024066808A1 (en) Quantum circuit generation method and apparatus, storage medium, and electronic device
CN115775030B (en) Quantum program rewriting method and device based on pattern matching and electronic device
CN116415667A (en) Data processing method, machine learning framework and related equipment
CN116541947B (en) Grover solving method and device for SAT or MAX-SAT problem of vehicle configuration
CN116432710B (en) Machine learning model construction method, machine learning framework and related equipment
CN114970872B (en) Quantum circuit compiling method and device, medium and electronic device
CN114372584B (en) Transfer learning method based on machine learning framework and related device
CN116432764B (en) Machine learning frame
CN115730670B (en) Method and device for generating mode file, medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination