WO2020245013A1 - Artificial neural network on quantum computing hardware - Google Patents

Artificial neural network on quantum computing hardware

Info

Publication number
WO2020245013A1
Authority
WO
WIPO (PCT)
Prior art keywords
quantum
neurons
classical
neuron
artificial
Prior art date
Application number
PCT/EP2020/064764
Other languages
French (fr)
Inventor
Daniele BAJONI
Dario GERACE
Chiara MACCHIAVELLO
Francesco TACCHINO
Original Assignee
Universita' Degli Studi Di Pavia
Priority date
Filing date
Publication date
Application filed by Universita' Degli Studi Di Pavia filed Critical Universita' Degli Studi Di Pavia
Publication of WO2020245013A1 publication Critical patent/WO2020245013A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 10/00 Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Definitions

  • the computational basis of the quantum processor, i.e., the basis in the 2^N-dimensional Hilbert space of N qubits, corresponds to all possible states of the single qubits being either in |0> or |1>. As usual, these states are labeled with integers j in {0, ..., m - 1}, with m = 2^N, arising from the decimal representation of the corresponding binary strings.
  • the first step of the algorithm prepares the state |psi_i> by encoding the input values i_j as its amplitudes, |psi_i> = (1/sqrt(m)) sum_j i_j |j>. Assuming the qubits to be initialized in the state |0>^(x)N, we perform a unitary transformation U_i such that U_i |0>^(x)N = |psi_i>.
  • any m x m unitary matrix having the amplitudes of |psi_i> in its first column can be used to this purpose. Notice that, in a more general scenario, the preparation of the input state starting from a blank register might be replaced by a direct call to a quantum memory where |psi_i> was previously stored.
  • the second step computes the inner product between the input vector i and the weight vector w using the quantum register.
  • this task can be performed efficiently by defining a unitary transformation, U_w, such that the weight quantum state |psi_w> = (1/sqrt(m)) sum_j w_j |j> is rotated as U_w |psi_w> = |m - 1>; after applying U_w to |psi_i>, the amplitude of |m - 1> equals <psi_w|psi_i> = i . w / m.
  • the nonlinearity required by the threshold function at the output of the perceptron is immediately obtained by performing a quantum measurement: indeed, after a multi-controlled NOT writes the amplitude of |m - 1> on an ancilla qubit, measuring the state of the ancilla qubit in the computational basis produces the output |1> (i.e., an active neuron) with probability |i . w / m|^2.
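As a concrete illustration, the activation probability of a single quantum neuron of this kind can be sketched with a small state-vector calculation in plain NumPy. This is an illustrative classical simulation of the algorithm of Ref. [1] (the helper name and the use of ±1 entries are our own conventions), not a hardware implementation:

```python
import numpy as np

def neuron_activation(i_vec, w_vec):
    """Probability of measuring the ancilla in |1>, i.e. the activation
    of the quantum perceptron, for input i_vec and weights w_vec
    (length-m arrays of +/-1 entries, m = 2**N)."""
    m = len(i_vec)
    # Step 1: U_i encodes the input as amplitudes of an N-qubit state.
    psi_i = np.asarray(i_vec, dtype=float) / np.sqrt(m)
    # Step 2: U_w is any unitary sending |psi_w> to |m-1>, so the
    # amplitude of |m-1> after U_w is <psi_w|psi_i> = (i . w) / m.
    psi_w = np.asarray(w_vec, dtype=float) / np.sqrt(m)
    c_last = np.dot(psi_w, psi_i)
    # Step 3: a multi-controlled NOT flips the ancilla on |m-1>;
    # measuring the ancilla yields 1 with probability |c_last|^2,
    # which is the nonlinear (squared) activation.
    return abs(c_last) ** 2

# Perfectly matching input and weights give activation 1.
print(neuron_activation([1, -1, 1, -1], [1, -1, 1, -1]))  # -> 1.0
# Orthogonal input and weights give activation 0.
print(neuron_activation([1, 1, -1, -1], [1, -1, 1, -1]))  # -> 0.0
```

The squared modulus in the last step is the measurement-induced nonlinearity discussed below.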
  • Ref. [1] does not teach how to build a quantum artificial neural network starting from a quantum representation of a neuron, i.e. how to create quantum synapses and how to train the ensuing network.
  • Fig. 1 shows Perceptron models.
  • (b) Scheme of the quantum algorithm for the implementation of the artificial neuron model on a quantum processor [1]. From the system initialized in its idle configuration, the first two unitary operations prepare the input quantum state and implement the U_w transformation.
  • Fig. 2 shows a schematic representation of the concept of a quantum deep neural network.
  • the circles represent quantum neurons as described in [1], the number of inputs, the number of layers, the number of neurons in the layers and the number of synapses for each neuron are purely representative;
  • Fig. 3 shows in (a) a simplified, elementary version of the quantum deep neural network, for which a quantum circuit representation is drawn in (b) and Fig. 4;
  • Fig. 4 shows an example of a quantum circuit model for the realization of the elementary deep neural network outlined in Fig. 3; input data are encoded into quantum states, and unitary operations are performed along the circuit to implement the weight vectors (quantum states not shown in the figure);
  • Fig. 5 shows the same as Fig. 4, but with a classical input.
  • the classical information is input in the network via the encoding unitary operations.
  • Fig. 6 shows chosen inputs of an experiment with the method of the invention, wherein the inputs are all possible 2x2 black and white pixel figures (the numbers on the x axis indicate the figures as specified in the legend);
  • Fig. 7 shows a graph with the output measured for an exemplary application of the invention method to the elementary deep neural network outlined in Fig. 3 and with the inputs of Fig. 6;
  • Fig. 8 shows the trend of training of the elementary deep neural network outlined in Fig. 3 and with the inputs of Fig. 6, wherein a series of positive and negative training cases were fed to the algorithm weighting a classically defined cost function; such a trend clearly shows that after a few thousand cases the algorithm manages to minimize the cost function, finding the correct weights of the network.
  • by "computing hardware", any processor or hardware logic is intended, such as a CPU, a GPU or the like, and/or quantum hardware, as a single unit or as a network of (similar or different) units.
  • an artificial neural network or a system of computing hardware configured to implement it comprises a plurality of layers including:
  • An input layer comprising one or more m-dimensional input data vectors (which can include classical data and/or quantum data), with m a positive integer;
  • An associated training algorithm configured to change the respective unitary weight function on the basis
  • the one or more m-dimensional input data vectors can be a single input data vector, fed to each neuron of the first hidden layer in parallel or fed in split portions to different neurons of the first hidden layer.
  • This description is about an innovation to realize artificial neural networks on quantum computing hardware via a circuit capable of performing neuromorphic operations in a quantum computing environment. It is based on another recent result from our group, the demonstration of a quantum computing version of a single artificial neuron [1]. The inventors have demonstrated that such neurons perform exponentially better than their classical counterparts when considering the amount of information they can store and elaborate.
  • the present invention combines such neurons via the use of quantum synapses into a deep neural network that will show the same plasticity and ability to undergo machine learning as classical neural networks, while still retaining the exponential advantage coming from the quantum neurons.
  • the artificial neural network of the present invention is such that the set of neurons above comprises :
  • a non-void subset of quantum artificial neurons implemented on quantum computing hardware, i.e. at least one quantum artificial neuron.
  • the artificial neural network can be such that the subset of quantum artificial neurons constitute one or more complete layers in the plurality of layers, the remaining layers being layers of classical neurons.
  • when the subset of classical artificial neurons is void, i.e. there are no classical artificial neurons, a complete quantum artificial neural network is herein described.
  • an example of how to build a deep neural network with a quantum circuit is shown in Fig. 2, according to the present description.
  • such a structure is in strict analogy to classical feedforward neural networks, and neurons represented in the same vertical column form a layer.
  • the input can be a vector (or a plurality of vectors when the network has parallel input node columns; in this case each vector can have a different dimension, i.e. a different value of m, and N conveniently corresponds to the maximum value among the vector dimensions) containing the values of the information to be elaborated.
  • This input can come from classical channels, for example an image, an audio file, a complex pattern from an Internet of Things application or any other source containing data of any kind.
  • the input can already be in the form of quantum information: this is the case if the input comes, for instance, from a quantum memory, or is the output of a preceding quantum computation or is transmitted in the form of a wave function via some quantum channel.
  • the input vector can be in binary form but also with real and continuous values or even with complex values, as a generalization of [1] .
  • the elements of the input vectors are encoded [4,7,1] in the network as coefficients of the computational basis vectors (or any other convenient choice of basis) of the qubits used for the quantum neurons.
  • the coefficients in each neuron can be defined on a different basis of quantum states, but the basis can be the same for all quantum artificial neurons.
  • this encoding can be obtained by performing a suitable unitary operation on the qubits of each neuron, or it can be directly fed to the network if already in the form of a quantum wave function.
  • the input can be fed to each neuron of the first hidden layer in parallel or split and different parts be fed to different neurons.
  • the respective (unitary) weight function is configured to act on the coefficients of the respective register qubits
  • Each neuron includes one or more respective ancilla qubits.
  • connection between successive layers of the quantum neural network is obtained by the invention through quantum synapses.
  • the "quantum synapses" can be defined as controlled operations between the qubits encoding single artificial neurons of a given layer and the ones representing the neurons of the respective successive layer, in such a way that the input information on the latter is determined by the activation state of the former. In case information on the activation status of each layer is not required, the global activation state of the network thus built can be determined by only performing measurements on the qubits encoding neurons and/or their ancillae in the output layer.
  • the classical feedforward of information between successive layers can be directly reformulated in terms of fully quantum coherent synapses according to the principle of deferred measurement [5], which states that a measurement performed at an intermediate stage of a quantum circuit can always be moved to the end of the computation by replacing classically controlled operations with quantum-controlled ones. Notice that any generic multi-controlled quantum operation can actually be used to implement the proposed quantum synapses, irrespective of the specific available hardware and even without any classical equivalent.
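The principle of deferred measurement can be checked numerically: measuring a control qubit and classically feeding the outcome forward yields the same output statistics as applying the quantum-controlled gate and measuring only at the end. A minimal two-qubit sketch (an illustrative simulation; the function names are our own):

```python
import numpy as np

# Two-qubit basis ordering |control, target>: 00, 01, 10, 11.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def measure_then_feedforward(state):
    """Measure the control qubit, then apply X to the target classically
    conditioned on the outcome; return the probability distribution
    over the computational basis."""
    p = np.abs(state) ** 2
    # Outcome 0: target untouched; outcome 1: target flipped (swap |10>, |11>).
    return np.array([p[0], p[1], p[3], p[2]])

def defer_measurement(state):
    """Apply the quantum-controlled gate first, measure at the end."""
    return np.abs(CNOT @ state) ** 2

state = np.array([0.6, 0.0, 0.8, 0.0])  # 0.6|00> + 0.8|10>, normalized
print(measure_then_feedforward(state))  # both calls print the same distribution
print(defer_measurement(state))
```

The two distributions coincide, which is why the intermediate ancilla measurements can be postponed to the output layer.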
  • one or more connections of said set of connections are quantum artificial synapses starting from respective elements of said non-void subset of quantum artificial neurons and ending on an end neuron belonging to said set of neurons in a successive layer.
  • said end neuron can include respective controlling operation means (which are hardware means) configured to apply, upstream of the respective measurement means (which are hardware means), a unitary operation on the one or more respective ancilla qubits of the respective elements of said non-void subset (e.g. starting neurons in one layer) and on the register qubits of the end neuron, wherein the unitary operation is configured to write its output as an input of the end neuron.
  • When the synapse connects a quantum neuron to a classical one, the measurement means already provide the suitable input for the classical neuron. When instead the synapse is between a classical neuron and a quantum neuron, the above unitary encoding function is conveniently used without the application of the above controlling operation means.
  • the two ancillas of the hidden layer carry the information on the two neurons' operations as the coefficients of their wave functions.
  • the synapses work as follows: the measurement on the ancillas 1 and 2 collapses their state on
  • measurements can be hardware dependent.
  • in superconducting quantum computing hardware, measurements consist in measuring currents in superconducting qubits or microwave cavity phase shifts; in prospective photonic quantum processors, these measurements might consist in assessing photon numbers, quadratures of the electromagnetic field, or other properties like polarization or frequency of the photons; in current versions of trapped-ion-based quantum computers, measurements consist in single-shot read-out of absorbed signal photons conditioned on the presence of control photons, and so on.
  • the measurements on the different ancillae of the various neurons can be simultaneous or not.
  • the measurements can be done at the output of the network.
  • the measurement at the output neurons gives the result, but the measurements at the other neurons provide the network with the necessary nonlinearity, because a measurement involves the modulus of a complex amplitude, and therefore some form of square operation.
  • the output neuron uses two ancillae: one is measured, and the output of the measurement can be passed on for classical information processing (for instance, but not limited to, classical layers of a hybrid network); the other remains in quantum form and can be passed on for further quantum information processing. This last ancilla may not be necessary in some cases. If the output neurons have more than one qubit in their registers, each coefficient of their wave functions can be set by synapses in the same way as illustrated here, using multiply-controlled operations instead of simple CNOTs.
  • in classical computing, any memory bit can be turned on or off at will, requiring no prior knowledge or extra gadgetry.
  • in quantum computing or classical reversible computing, all operations on computer memory must be reversible, and toggling a bit on or off would lose the information about the initial value of that bit. For this reason, in a quantum algorithm there is no way to deterministically put bits in a specific prescribed state unless one is given access to bits whose original state is known in advance. Such bits, whose values are known a priori, are known as ancilla bits in a quantum or reversible computing task.
  • a first use for ancilla bits is downgrading complicated quantum gates into simple gates. For example, by placing controls on ancilla bits, a Toffoli gate can be used as a controlled-NOT (CNOT) gate or a NOT gate.
  • a single ancilla bit is necessary and sufficient for universal computation. Additional ancilla bits are not necessary, but the extra workspace can allow for simpler circuit constructions that use fewer gates.
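The downgrading of a Toffoli gate to a CNOT by fixing one of its controls to |1> can be verified with a small matrix computation (the basis ordering and helper names are our own choices):

```python
import numpy as np

def toffoli():
    """Toffoli gate on the 3-qubit basis |a, c, t> (ancilla, control,
    target): flips t when a = c = 1, i.e. swaps |110> and |111>."""
    U = np.eye(8)
    U[[6, 7]] = U[[7, 6]]
    return U

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# Fix the ancilla to |1>: the Toffoli then acts on (c, t) exactly as a CNOT.
anc1 = np.array([0.0, 1.0])
psi = np.array([0.8, 0.0, 0.0, 0.6])        # arbitrary (c, t) state
full = toffoli() @ np.kron(anc1, psi)       # embed, apply Toffoli
reduced = full[4:]                          # (c, t) amplitudes in the a=1 block
print(np.allclose(reduced, CNOT @ psi))     # True
```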
  • quantum catalysis uses ancilla qubits to store entangled states that enable tasks that would not normally be possible with local operations and classical communication (LOCC) .
  • Quantum computers also use ancilla bits for quantum error correction [2,3] .
  • the ancilla qubit of each neuron of the preceding layer carries the information of the operation of the neuron in the form of its coefficients on a suitable basis.
  • the coefficients of these qubits can be set, thus setting the input wave function for the neurons of the new layer. This can be achieved by directly setting each coefficient of the neuron's qubits on a suitable basis, or through more efficient means like, for instance, the Hypergraph States Method [1], or any other suitable or convenient method.
  • the ancilla is not present on the classical neurons, therefore between two classical neurons and from a classical neuron to a quantum neuron there is no ancilla measurement. Instead, from a quantum neuron to a classical one, the ancilla measurement result is the input value of the classical neuron.
  • Each neuron's ancilla is then collapsed via measurement.
  • This operation can be performed layer by layer, or as a batch at the end of the quantum deep neural network operation.
  • the measurement process provides the nonlinearity necessary for training a neuron [1].
  • quantum neural networks of this kind have the necessary non-linearity thanks to the non-linearity of the neurons above, and can be trained by using back-propagation algorithms and gradient descent; any other optimization method, for instance the Newton-Raphson method, or any other convenient learning procedure could be applied to this quantum network as well.
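As a sketch of such a training loop, the following classical stand-in runs plain gradient descent, with finite-difference gradients, on a single simulated quantum neuron whose output is the measurement-style squared overlap. The cost function, learning rate, and helper names are illustrative assumptions, not the patent's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def activation(w, x):
    """Single simulated quantum neuron: squared overlap of the
    normalized weight and input states (the measurement nonlinearity)."""
    w = w / np.linalg.norm(w)
    x = x / np.linalg.norm(x)
    return float(np.dot(w, x) ** 2)

def cost(w, cases, targets):
    return sum((activation(w, x) - t) ** 2 for x, t in zip(cases, targets))

def train(cases, targets, steps=500, lr=0.5, eps=1e-5):
    """Gradient descent with finite-difference gradients, a classical
    stand-in for back-propagation through the network."""
    w = rng.normal(size=len(cases[0]))
    for _ in range(steps):
        grad = np.zeros_like(w)
        for k in range(len(w)):
            dw = np.zeros_like(w)
            dw[k] = eps
            grad[k] = (cost(w + dw, cases, targets)
                       - cost(w - dw, cases, targets)) / (2 * eps)
        w = w - lr * grad
        w = w / np.linalg.norm(w)  # keep the weight state normalized
    return w

# Teach the neuron to fire on one pattern and stay silent on an orthogonal one.
cases = [np.array([1.0, 1.0, -1.0, -1.0]), np.array([1.0, -1.0, 1.0, -1.0])]
targets = [1.0, 0.0]
w = train(cases, targets)
print(activation(w, cases[0]), activation(w, cases[1]))
```

After training, the first activation approaches 1 and the second approaches 0, despite the quadratic measurement nonlinearity sitting inside the cost.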
  • the output of the deep quantum neural network can be in classical form, i.e. the result of measurements on the ancilla qubits of the neurons of the last layer, to be used for data processing, or in quantum form, to be passed on to other quantum computation algorithms or transmitted via quantum channels.
  • a computer-implemented method for training an artificial neural network, wherein the computer includes quantum computing hardware, comprises the following steps:
  • A. A system of computing hardware configured to implement an artificial neural network as defined above;
  • A0 Mapping the neurons of the subset of classical artificial neurons into further quantum neurons which are included into said non-void set of quantum artificial neurons, so that all the neurons of said set of neurons are quantum artificial neurons;
  • said associated training algorithm is a quantum function of all the neurons in said set of neurons.
  • after step C, the following further step is executed:
  • This forth-and-back mapping (e.g. as in Fig. 3 (a) to (b), and vice versa, due to the fact that the quantum operations are reversible) allows using quantum networks for problems that need a speed-up in training, and a (partially) classical network to apply the trained network.
  • an advantage of quantum artificial neural networks with respect to neural networks implemented on classical computing hardware is that the quantum version combines the exponential advantage of quantum computing in both memory and parallel computation with the nonlinearities typical of classical computing systems, which act to provide the network with the plasticity necessary for machine learning. Indeed, we expect that networks with layers of quantum neurons each built with >30 qubits will be able to provide functionalities impossible even for the best-performing GPU systems, and networks with quantum neurons built with >50 qubits will provide functionalities impossible even with supercomputers.
  • N = 50 qubits are able to store about 9 x 10^15 bytes of information (i.e., 9 PB, assuming 8 bytes to store a complex number in single precision), which roughly corresponds to the random access memory of state-of-art supercomputers.
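The quoted figure follows from simple arithmetic: a register of N qubits carries 2^N complex amplitudes, so at 8 bytes per amplitude the equivalent classical memory is:

```python
# Memory needed to store the full state vector of N qubits on classical
# hardware, at 8 bytes per complex amplitude (single-precision complex).
def state_vector_bytes(n_qubits, bytes_per_amplitude=8):
    return (2 ** n_qubits) * bytes_per_amplitude

print(state_vector_bytes(50))  # 9007199254740992 bytes, i.e. about 9 PB
print(state_vector_bytes(30))  # 8589934592 bytes, i.e. about 8.6 GB
```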
  • the ability of the quantum artificial neural network to output classical information makes these networks fully compatible with classical neural networks.
  • these layers can be seamlessly integrated in a larger classical neural network.
  • the quantum layers can for instance be used, thanks to their exponential memory advantage with respect to classical layers, for highly memory- and computation-intensive tasks. They can, for instance, be used to implement convolutional filters of very large patterns on extremely large input datasets.
  • the fact that the quantum layers of the network can be trained using existing classical algorithms like, for instance, back-propagation means that the integration with the classical layers is seamless, and the whole neural network can be trained using existing, well-known training algorithms. To this extent we notice that, at the interface between a quantum layer and a following classical layer, additional nonlinear functions can be applied to the results of the measurement, to improve training.
  • these networks can be realized in a fully quantum way, using only quantum neurons as described in [1].
  • Deep neural networks containing only quantum layers can be trained as before using classical optimization algorithms (like, for instance, back-propagation) but also using purely quantum algorithms, like quantum search algorithms.
  • quantum search algorithms provide at least polynomial advantages with respect to their classical counterparts. Indeed, in quantum algorithms one can for instance input a set of training cases all at the same time in the form of a suitable quantum superposition, including, if needed, additional register or ancilla qubits.
  • the quantum artificial neural network can be used to implement arbitrary functions for quantum information processing.
  • a well-known result for artificial neural networks is the Universal Approximation Theorem.
  • This theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate arbitrary continuous functions of the input.
  • a quantum deep neural network with a single hidden layer built using the methods described here can be used to provide, as an output, qubits in a quantum state for which the coefficients over suitable basis are related to the coefficients of the input state via an arbitrary continuous function implemented at will. This will be extremely useful in quantum information processing algorithms, and quantum computing algorithms in particular, to provide nonlinear functionalities which will be otherwise difficult to obtain.
  • the usefulness is however not limited to machine learning as the general ability of representing functions of the input can be of use for quantum simulation and quantum computation protocols in general.
  • the networks cannot be trained by simply mimicking the training of a classical neural network; instead, the optimal parameters for such collective rotations are to be found, not the values of independent qubits or the values of rotations of independent qubits.
  • the training of the present networks is thus fundamentally different from what is done in classical machine learning and in [8].
  • the quantum synapses are realized by controlled-Z operations (blue and red). Notice that in Fig. 3 (b) these are equivalently decomposed into a proper combination of CNOT and H operations.
  • the nodes n1 and n2 are encoded in parallel.
  • the output neuron implicitly carries the result of the operations of the hidden layer, thanks to the fact that, in the invention, before the tracing operation the output ancilla is entangled with the register qubits of the hidden layer neurons.
  • the result of the operation is output much more efficiently than in Kwok Ho Wan et al. [8], where information is carried by independent qubits.
  • the weights in the network have been chosen to distinguish whether the input figure contains lines, either vertical or horizontal. This is an impossible task for a single perceptron, since vertical and horizontal lines are orthogonal.
  • the chosen inputs are all the possible black and white 2x2 pixel figures (the numbers on the x axis indicate figures as outlined in the legend).
  • the graph of Fig. 7 shows the measured output; the output is larger than 0.5 (i.e. positive) for the figures containing lines.
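For reference, the 2x2 input set can be enumerated classically. One plausible labeling, stated here as an assumption since the patent's figure legend is not reproduced, counts a figure as containing a line when exactly one full row or one full column is black:

```python
from itertools import product

def contains_line(p):
    """p = (p0, p1, p2, p3): pixels of a 2x2 figure, row-major, 1 = black.
    Hypothetical labeling: a figure 'contains a line' when exactly one
    full row or one full column is black and the rest is white."""
    rows = [(p[0], p[1]), (p[2], p[3])]
    cols = [(p[0], p[2]), (p[1], p[3])]
    full_lines = [all(r) for r in rows] + [all(c) for c in cols]
    return sum(p) == 2 and any(full_lines)

figures = list(product([0, 1], repeat=4))   # all 16 possible 2x2 figures
positives = [f for f in figures if contains_line(f)]
print(positives)  # -> [(0, 0, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0)]
```

Note that under the ±1 encoding the two vertical patterns are orthogonal to the two horizontal ones, which is why a single perceptron cannot recognize the whole class.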
  • a simulated experiment realizing the training of the same quantum neural network has been performed using a backpropagation algorithm.
  • Quantum technologies are emerging as an important route to improve information and cognitive production tasks beyond the goals of "Industry 4.0".
  • Industries using AI to elaborate large quantities of data will be heavily impacted by our innovation, due to the intrinsic capability of our product to scale exponentially better than standard neural networks with the dimension of the data to be elaborated.
  • These industries include, but are not limited to, those operating in the field of smart cities and smart mobility, both heavily reliant on the efficient analysis of large quantities of data coming from grids of sensors.

Abstract

The invention concerns artificial neural networks on quantum computing hardware via a circuit capable of performing neuromorphic operations in a quantum computing environment. It is based on a recent demonstration of a quantum computing version of a single artificial neuron, that performs exponentially better than the classical counterpart when considering the amount of information it can store and elaborate. The present invention combines such neurons via the use of quantum synapses into an artificial (e.g. deep) neural network that will show the same plasticity and ability to undergo machine learning as classical neural networks, while still retaining the exponential advantage coming from the quantum neurons.

Description

Artificial neural network on quantum computing hardware
Applicant: Universita degli Studi di Pavia
Inventors: Daniele Bajoni, Dario Gerace, Chiara
Macchiavello, Francesco Tacchino
The present invention concerns the field of artificial neural networks on quantum computing hardware.
Background art
With the term "classical computer" one generally means any deterministic Turing machine, such as a laptop, a workstation, a CPU, or a GPU, or such as may be found in smartphones and other electronic appliances, servers, supercomputers, and so on; it is a machine capable of Boolean operations on binary units of information called bits. A common state-of-art classical computer can treat strings of bits much larger than 1 billion elements (Gbits) and perform much more than 1 billion operations per second (Gflop). In the case of the best existing supercomputers, these figures can increase by as much as almost nine orders of magnitude.
On the other hand, quantum computers work under completely different paradigms. They are based on qubits instead of bits, physical objects that can be placed in a superposition of two logical quantum states, and are poised to outperform any existing classical machine via the exploitation of quantum mechanical effects like entanglement. Artificial Intelligence and machine learning are bringing about a revolution in how we treat data, performing tasks like speech recognition, pattern recognition and data classification with unprecedented efficiency. Artificial Intelligence is expected to undertake a significant leap ahead via the use of quantum computers, a radically new paradigm of information processors that harness quantum physics to elaborate data with exponential speed-ups with respect to traditional computers. Quantum computers have long been considered futuristic machines, but they are currently starting to be developed by big IT corporations such as Microsoft, Google, Intel, and IBM, as well as by startups such as Rigetti, Xanadu, and IonQ.
Quantum computers are machines that may soon grow to offer a computing advantage far surpassing predictions from Moore's Law. Indeed, quantum computing has been theoretically shown to offer speedups that can be exponential over traditional machines, in tasks like number factoring, solving linear systems and data classification.
Artificial neural networks, the computing systems at the basis of AI, could be realized with exponentially improved efficiency on a quantum computer. This exponential improvement was demonstrated for the case of a single neuron by our group in a recent result [1], a paper that is here integrally incorporated by reference. Interest in Quantum Artificial Intelligence mainly stems from the possibility of elaborating an amount of data far larger than what is doable using conventional neural networks.
In particular, the ability to build artificial neural networks capable of exploiting the exponential advantage of quantum computers will be advantageous to elaborate large amounts of data, such as very large image files, sanitary data for public health, market data for financial applications, the "data deluge" expected from the Internet of Things, and all the fields that have benefited from the application of machine learning algorithms so far.
There is thus a need to advantageously implement artificial (e.g. deep) neural networks on quantum computing hardware. Neural networks are ultimately limited by the computational power and the memory available on the hardware implementing them. Memory in particular can be a limiting factor when dealing with massively parallel computing systems, such as those implemented using GPUs or in supercomputers. Quantum processors will be able to elaborate quantities of data surpassing conventional computers within a matter of a few years. Moreover, quantum computing combines data processing and memory on the same device, thus overcoming existing bottlenecks on classical hardware. As in classical artificial networks, only a network structure of connected multiple layers allows solving a large class of complex tasks in a feedforward configuration. Analogously, only an artificial (e.g. deep) quantum network is expected to become an effective tool; however, how to design such a network with suitable synapses is still unknown.
The classical model of a perceptron in classical neural networks is depicted in Fig. 1(a).
With respect to this model, in their paper published in npj Quantum Information, Kwok Ho Wan and coworkers [8] detail a method to implement neural networks on quantum computers. Their method consists in representing neural networks directly, as would be done on a classical computer, by encoding the value of each classical bit on a single qubit. Differently from classical machines, quantum computers require reversible operations, and this constitutes a problem for the authors. They obviate this problem by using additional qubits (a common practice in quantum computing algorithms; these additional qubits are usually called "ancillas") to store the information that in a classical computer would be erased (an operation impossible on a quantum computer). The authors do not give any general indication on the number of necessary ancillas, and in the only example given in the text they add approximately two ancillas per qubit. Let us, for simplicity, assume the number of ancillas is proportional to the number of qubits carrying the information. As an example, suppose now to represent a string of 8 classical bits of information (i.e., a simple byte), written generically as $b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0$ (the specific string appears as a figure in the original).
As is clear from the section "Reversible → Unitary" and Fig. 1 in that paper, this method employs 8 qubits plus the ancillas. The quantum state of the system would then explicitly be written as

$$|b_7\rangle |b_6\rangle \cdots |b_0\rangle \otimes |a_1\rangle |a_2\rangle \cdots |a_k\rangle,$$

where the states $|a_1\rangle, \ldots, |a_k\rangle$ represent the states of the ancillae. When dealing with real-valued numbers, classical computers generally discretize them using multiple bits, as for instance in the 8-bit (256-level) encoding of grey-scale pixels. Therefore, the method employed in [8] would clearly need more than 8×8 = 64 qubits (plus ancillas) to treat 8 real-valued numbers discretized with an 8-bit encoding. In general, it is clear that in [8] representing ten million classical bits (for instance, an image from a present-day camera) would require more than ten million qubits; representing and elaborating 10 billion classical bits (for instance, "big data" datasets on which to perform pattern recognition) would require more than 10 billion qubits, and so on. Current quantum computers have tens of qubits, and no more than a few thousand qubits are expected for the machines of the next few decades; a billion qubits is beyond even futuristic expectations.
The authors in [8] then proceed to describe how to combine these neurons, which closely mimic classical neurons, into networks that are exactly classical networks run on a quantum computer reduced to a very inefficient classical computer. As the qubits are reduced to bits, information is passed from one layer to another simply by passing the qubits, as can be clearly seen in Fig. 5 of that paper. As the quantum computer is used exactly as a classical one, the network can be trained exactly like classical neural networks, by searching for optimal values of the individual qubits ("individual" being here the fundamental word). In essence, following this procedure quantum computers become very expensive and inefficient classical CPUs, in which quantum effects are at most an inconvenience.
Instead, in Fig. 1(b) taken from [1], the perceptron model is radically different.
The authors initially limited the input and weight vectors to binary values, $i_j, w_j \in \{-1, 1\}$, as in McCulloch-Pitts neurons. Hence, an $m$-dimensional input vector is encoded using the $m$ coefficients needed to define a general wavefunction of $N$ qubits. In practice, given arbitrary input and weight vectors

$$\vec{i} = (i_0, i_1, \ldots, i_{m-1}), \qquad \vec{w} = (w_0, w_1, \ldots, w_{m-1}), \qquad i_j, w_j \in \{-1, 1\},$$

the following two quantum states are defined:

$$|\psi_i\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} i_j\, |j\rangle, \qquad |\psi_w\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} w_j\, |j\rangle. \qquad (2)$$

The states $|j\rangle \in \{|00\cdots0\rangle, |00\cdots1\rangle, \ldots, |11\cdots1\rangle\}$ form the so-called computational basis of the quantum processor, i.e., the basis in the Hilbert space of $N$ qubits, corresponding to all possible states of the single qubits being either in $|0\rangle$ or $|1\rangle$. As usual, these states are labeled with integers $j \in \{0, \ldots, m-1\}$ arising from the decimal representation of the respective binary string. Evidently, if $N$ qubits are used in the register, there are $m = 2^N$ (therefore $N = \log_2 m$) basis states labeled $|j\rangle$ and, as outlined in Eq. (2), we can use factors $\pm 1$ to encode the $m$-dimensional classical vectors into a uniformly weighted superposition of the full computational basis.
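As a concrete illustration, the encoding of Eq. (2) can be simulated classically in a few lines (plain Python; the pattern values below are illustrative, not taken from [1]):

```python
from math import sqrt

def encode(vec):
    """Encode an m-dimensional vector of +/-1 entries as the amplitudes
    of an N-qubit state |psi> = (1/sqrt(m)) * sum_j vec[j] |j>,
    following Eq. (2). Returns the list of m amplitudes."""
    m = len(vec)
    assert m & (m - 1) == 0, "m must be a power of two (m = 2**N)"
    norm = 1.0 / sqrt(m)
    return [norm * x for x in vec]

# 4 qubits encode a 16-dimensional binary pattern in 16 amplitudes.
i_vec = [1, -1, 1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, -1]
psi_i = encode(i_vec)
# The resulting state is normalized: the squared amplitudes sum to 1.
assert abs(sum(c * c for c in psi_i) - 1.0) < 1e-12
```

Note how the information content grows exponentially: the list of amplitudes has length $2^N$, while only $N$ qubits are involved.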
The first step of the algorithm prepares the state $|\psi_i\rangle$ by encoding the input values in $\vec{i}$. Assuming the qubits to be initialized in the state $|00\cdots0\rangle \equiv |0\rangle^{\otimes N}$, we perform a unitary transformation $U_i$ such that

$$U_i\, |0\rangle^{\otimes N} = |\psi_i\rangle. \qquad (3)$$

In principle, any $m \times m$ unitary matrix having $\vec{i}/\sqrt{m}$ in the first column can be used for this purpose. Notice that, in a more general scenario, the preparation of the input state starting from a blank register might be replaced by a direct call to a quantum memory where $|\psi_i\rangle$ was previously stored.
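One way to construct such a unitary classically is to complete the first column to a full orthonormal basis. The following sketch assumes NumPy is available and uses a QR decomposition (one of many valid choices; the helper name is ours):

```python
import numpy as np

def build_Ui(i_vec):
    """Construct an m x m real unitary whose first column is
    i_vec/sqrt(m), so that U_i |0...0> = |psi_i> as in Eq. (3).
    The remaining columns are completed via QR decomposition."""
    m = len(i_vec)
    v = np.asarray(i_vec, dtype=float) / np.sqrt(m)
    A = np.eye(m)
    A[:, 0] = v          # full rank, since v has a nonzero first entry here
    Q, _ = np.linalg.qr(A)
    # QR may return the first column with a flipped sign; fix it.
    # Flipping the sign of one column preserves orthonormality.
    if np.dot(Q[:, 0], v) < 0:
        Q[:, 0] *= -1
    return Q

i_vec = [1, -1, 1, 1]
U = build_Ui(i_vec)
zero = np.zeros(4); zero[0] = 1.0            # the |00> state
psi = U @ zero
assert np.allclose(psi, np.asarray(i_vec) / 2.0)   # 1/sqrt(4) = 1/2
assert np.allclose(U.T @ U, np.eye(4))             # U is unitary
```

On real hardware the same operation would of course be compiled into elementary gates; this sketch only verifies the matrix-level condition of Eq. (3).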
The second step computes the inner product between $\vec{i}$ and $\vec{w}$ using the quantum register. This task can be performed efficiently by defining a unitary transformation, $U_w$, such that the weight quantum state is rotated as

$$U_w\, |\psi_w\rangle = |1\rangle^{\otimes N} = |m-1\rangle. \qquad (4)$$

As before, any $m \times m$ unitary matrix having $\vec{w}^{\,T}/\sqrt{m}$ in the last row satisfies this condition. If we apply $U_w$ after $U_i$, the overall $N$-qubit quantum state becomes

$$U_w\, U_i\, |0\rangle^{\otimes N} = \sum_{j=0}^{m-1} c_j\, |j\rangle \equiv |\phi_{i,w}\rangle. \qquad (5)$$
Using Eq. (4), the scalar product between the two quantum states is

$$\langle \psi_w | \psi_i \rangle = \langle \psi_w | U_w^{\dagger}\, U_w | \psi_i \rangle = \langle m-1 | \phi_{i,w} \rangle = c_{m-1}, \qquad (6)$$

and from the definitions in Eq. (2) it is easily seen that the scalar product of input and weight vectors is $\vec{w} \cdot \vec{i} = m\, \langle \psi_w | \psi_i \rangle$. Therefore, the desired result is contained, up to a normalization factor, in the coefficient $c_{m-1}$ of the final state $|\phi_{i,w}\rangle$. In order to extract this information, the authors in [1] propose to use an ancilla qubit $a$ initially set in the state $|0\rangle$. A multi-controlled NOT gate between the $N$ encoding qubits and the target $a$ leads to

$$|\phi_{i,w}\rangle\, |0\rangle_a \;\to\; \sum_{j=0}^{m-2} c_j\, |j\rangle\, |0\rangle_a \;+\; c_{m-1}\, |m-1\rangle\, |1\rangle_a. \qquad (7)$$

The nonlinearity required by the threshold function at the output of the perceptron is immediately obtained by performing a quantum measurement: indeed, measuring the state of the ancilla qubit in the computational basis produces the output $|1\rangle_a$ (i.e., an activated perceptron) with probability $|c_{m-1}|^2$. As the authors in [1] show, this choice proves simultaneously very simple and effective in producing the correct result. However, it should be noticed that more refined threshold functions can be applied once the inner product information is stored on the ancilla.
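The end-to-end behaviour of the perceptron can be checked with a minimal classical simulation, since by Eqs. (6)-(7) the activation probability equals $|\vec{w}\cdot\vec{i}/m|^2$ (the binary patterns below are illustrative choices):

```python
def quantum_perceptron(i_vec, w_vec):
    """Classical simulation of the quantum perceptron of [1]: the
    probability of measuring the ancilla in |1> is |c_{m-1}|^2,
    i.e. |w.i / m|^2 by Eqs. (6)-(7)."""
    m = len(i_vec)
    assert len(w_vec) == m
    c = sum(w * i for w, i in zip(w_vec, i_vec)) / m   # <psi_w|psi_i>
    return c * c                                       # Born rule

w = [1, -1, 1, 1]
print(quantum_perceptron(w, w))                 # parallel vectors: 1.0
print(quantum_perceptron([-x for x in w], w))   # anti-parallel: 1.0
print(quantum_perceptron([1, 1, 1, -1], w))     # orthogonal: 0.0
```

Note that the parallel and anti-parallel cases give the same output, anticipating the quadratic symmetry discussed next.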
It is also noticed that both parallel and anti-parallel $\vec{i}$ and $\vec{w}$ vectors produce an activation of the perceptron, while orthogonal vectors always result in the ancilla being measured in the state $|0\rangle_a$. This is a direct consequence of the probability $|c_{m-1}|^2 = |\vec{w} \cdot \vec{i}\,/\,m|^2$ being a quadratic function, unlike classical perceptrons, which in their simplest realizations can only be employed as linear classifiers. In fact, the quantum perceptron model in [1] can be efficiently used as a pattern classifier, since it allows a given pattern and its negative to be interpreted on an equivalent footing. Formally, this intrinsic symmetry reflects the invariance of the encoding states $|\psi_i\rangle$ under the addition of a global $-1$ factor. With this model, and considering the 8-bit string used above for [8], the quantum state of the system would then explicitly be written as

$$|\psi\rangle = c_0\, |000\rangle + c_1\, |001\rangle + c_2\, |010\rangle + \cdots + c_7\, |111\rangle,$$

where the states $|000\rangle$, $|001\rangle$ and so on are the collective states of the three qubits (the so-called "computational basis"), which can be fully entangled. This means that 3 qubits can represent, and be used to elaborate, 8 real values (encoded, for instance, as the phase or amplitude of the complex coefficients $c_j$). The exponential advantage of this representation then becomes clear: 1 million real numbers would need 20 qubits, 1 billion real numbers 30 qubits, 1 trillion real numbers 40 qubits, and so on. While in [8] Wan et al. only explore an exponentially small part of the Hilbert space of the qubits they employ, the representation in [1] uses the full Hilbert space, obtaining both a storage and an elaboration advantage.
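The scaling just quoted follows directly from $N = \lceil \log_2 m \rceil$ and can be verified in a few lines (plain Python; the helper name is ours):

```python
from math import ceil, log2

def qubits_needed(m):
    """Qubits needed to amplitude-encode m values: N = ceil(log2 m)."""
    return ceil(log2(m))

for m in (10**6, 10**9, 10**12):
    print(f"{m:>15} values -> {qubits_needed(m)} qubits")
# 1 million -> 20 qubits, 1 billion -> 30, 1 trillion -> 40
```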
However, Ref. [1] does not teach how to build a quantum artificial neural network starting from a quantum representation of a neuron, i.e. how to create quantum synapses and how to train the ensuing network.
Object and subject-matter of the invention
It is the object of the present invention to solve the problems and to overcome the drawbacks of the prior art .
It is subject-matter of the present invention a system and a method according to the appended claims, which are an integral part of the present description.
Detailed description of invention's embodiments
List of figures
The invention will be now described by way of illustration but not by way of limitation, with specific reference to the drawings of the enclosed figures, wherein :
— Fig. 1 shows perceptron models. (a) Schematic outline of the classical perceptron as a model of artificial neuron: an input array $\vec{i}$ is processed with a weight vector $\vec{w}$ to produce a linear, binary-valued output function. In its simplest realization, the elements of $\vec{i}$ and $\vec{w}$ are also binary valued, the perceptron acting as a binary (linear) classifier. (b) Scheme of the quantum algorithm for the implementation of the artificial neuron model on a quantum processor [1]: from the system initialized in its idle configuration, the first two unitary operations prepare the input quantum state $|\psi_i\rangle$ and implement the $U_w$ transformation, respectively. The final outcome is then written on an ancilla qubit, which is eventually measured to evaluate the activation state of the perceptron;
— Fig. 2 shows a schematic representation of the concept of a quantum deep neural network. The circles represent quantum neurons as described in [1]; the number of inputs, the number of layers, the number of neurons per layer and the number of synapses for each neuron are purely representative;
— Fig. 3 shows in (a) a simplified, elementary version of the quantum deep neural network, for which a quantum circuit representation is drawn in (b) and Fig. 4;
— Fig. 4 shows an example of a quantum circuit model for the realization of the elementary deep neural network outlined in Fig. 3. Input data are encoded into quantum states $|\psi_i\rangle$; unitary operations $U_w$ are performed along the circuit to implement the weight vectors $\vec{w}$ (quantum states $|\psi_w\rangle$ not shown in the figures) and their scalar product. Vertical lines with circles represent controlled and multi-controlled operations, H represents a Hadamard gate, X a NOT operation. Results are obtained by measurement of the ancilla qubits (represented by the gauge symbol); all these are per se standard operations in quantum computing [5];
— Fig. 5 shows the same as Fig. 4, but with a classical input. The classical information is input into the network via the unitary operations $U_i$;
— Fig. 6 shows the chosen inputs of an experiment with the method of the invention, namely all possible 2×2 black-and-white pixel figures (the numbers on the x axis indicate the figures as specified in the legend);
— Fig. 7 shows a graph with the output measured for an exemplary application of the invention method to the elementary deep neural network outlined in Fig. 3 and with the inputs of Fig. 6; and
— Fig. 8 shows the training trend of the elementary deep neural network outlined in Fig. 3 with the inputs of Fig. 6, wherein a series of positive and negative training cases were fed to the algorithm, weighting a classically defined cost function; the trend clearly shows that after a few thousand cases the algorithm manages to minimize the cost function, finding the correct weights of the network.
It is here specified that elements of different embodiments can be combined together to provide further unlimited embodiments respecting the technical concept of the invention, as the person skilled in the art will directly and unambiguously understand from what has been described.
The present description also refers to the prior art for its implementation, with respect to the non- described detailed features, such as for example elements of minor importance usually used in the prior art in solutions of the same type.
When an element or a plurality of elements are introduced, it always means that it can be "at least one" or "one or more" .
When a list of elements or features is given in this description, it is meant that the invention "includes" or alternatively "is composed of" such elements. By "computer", any processor or hardware logic is intended, such as a CPU, GPU or the like and/or quantum hardware, as a single unit or as a network of (similar or different) units.
Embodiments
According to prior art, an artificial neural network or a system of computing hardware configured to implement it comprises a plurality of layers including:
— An input layer comprising one or more m-dimensional input data vectors (which can include classical data and/or quantum data), with m a positive integer;
— A set of neurons:
comprising one or more hidden layers constituted by a plurality of hidden neurons;
comprising an output layer constituted by a plurality of output neurons providing output data vectors; and
each comprising a respective neuron register of data including one or more sets of m coefficients;
— A set of connections between the neurons;
— A respective unitary weight function $U_w$ configured to act on the coefficients of the neuron register; and

— An associated training algorithm configured to change the respective unitary weight function $U_w$ on the basis of (depending on) the output data vectors.
The one or more m-dimensional input data vectors can be a single input data vector, fed to each neuron of the first hidden layer in parallel or fed in split portions to different neurons of the first hidden layer.
This description concerns an innovation for realizing artificial neural networks on quantum computing hardware via a circuit capable of performing neuromorphic operations in a quantum computing environment. It is based on another recent result from our group: the demonstration of a quantum computing version of a single artificial neuron [1]. The inventors have demonstrated that such neurons perform exponentially better than their classical counterparts when considering the amount of information they can store and elaborate. The present invention combines such neurons, via the use of quantum synapses, into a deep neural network that shows the same plasticity and ability to undergo machine learning as classical neural networks, while still retaining the exponential advantage coming from the quantum neurons.
In general, the artificial neural network of the present invention is such that the set of neurons above comprises :
— A non-void subset of quantum artificial neurons implemented on quantum computing hardware (i.e. at least a quantum artificial neuron) ; and
— A subset of classical artificial neurons implemented on classical computing hardware.
According to an aspect of the invention, the artificial neural network can be such that the subset of quantum artificial neurons constitutes one or more complete layers in the plurality of layers, the remaining layers being layers of classical neurons. In case the subset of classical artificial neurons is void, i.e. there are no classical artificial neurons, a complete quantum artificial neural network is herein described.
An example of how to build a deep neural network with a quantum circuit is shown in Fig. 2, according to the present description. The circles represent quantum neurons, realized as outlined in reference [1] or in any equivalent way encoding an m-dimensional input vector with N qubits, where m = 2^N. According to an aspect of the invention, such a structure is in strict analogy to classical feedforward neural networks, and neurons represented in the same vertical column form a layer.
The input can be a vector (or a plurality of vectors when the network has parallel input node columns; in this case each vector can have a different dimension, i.e. a different value of m, and N conveniently corresponds to the maximum value among the vector dimensions) containing the values of the information to be elaborated. This input can come from classical channels, for example an image, an audio file, a complex pattern from an Internet of Things application, or any other source containing data of any kind. However, the input can already be in the form of quantum information: this is the case if the input comes, for instance, from a quantum memory, is the output of a preceding quantum computation, or is transmitted in the form of a wave function via some quantum channel.
The input vector can be in binary form, but also with real and continuous values or even with complex values, as a generalization of [1]. The elements of the input vectors are encoded [4,7,1] in the network as coefficients of the computational basis vectors (or any other convenient choice of basis) of the qubits used for the quantum neurons. A set of N qubits encodes m = 2^N entries of the input vector. It is clear that when mapping a real computation onto N qubits the actual data can be less than the allocated bit string, in which case some of the entries may be replaced by suitable zeros. It is also clear that N is conveniently the smallest integer not smaller than log2 m, i.e. N = ceiling(log2 m).
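The zero-padding convention just described can be sketched as follows (plain Python; the normalization choice is illustrative and assumes at least one nonzero entry):

```python
from math import ceil, log2, sqrt

def pad_and_encode(values):
    """Pad an arbitrary-length real input to the next power of two
    (N = ceil(log2 m) qubits) with zeros, then normalize the padded
    vector into valid quantum state amplitudes."""
    m = len(values)
    N = max(1, ceil(log2(m)))
    padded = list(values) + [0.0] * (2 ** N - m)
    norm = sqrt(sum(v * v for v in padded))   # assumes a nonzero input
    return N, [v / norm for v in padded]

N, amps = pad_and_encode([3.0, 4.0, 0.0, 0.0, 5.0])   # m = 5 -> N = 3
print(N, len(amps))   # 3 qubits, 8 amplitudes
```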
In general, according to an aspect of the invention, the coefficients in each neuron can be defined on a different basis of quantum states, but the basis can be the same for all quantum artificial neurons.
If the input is classical, this encoding can be obtained by performing a suitable unitary operation on the qubits of each neuron, or the input can be directly fed to the network if already in the form of a quantum wave function. The input can be fed to each neuron of the first hidden layer in parallel, or split, with different parts fed to different neurons.
In general, according to an aspect of the invention, in the non-void subset of quantum artificial neurons:
— The above respective neuron register of data is a respective set of N neuron register qubits representing said set of m coefficients as coefficients of a superposition of m basis quantum states, wherein N is derived from the relationship m = 2^N;

— If a neuron receives input from a classical artificial neuron or from data of the input data vector, the neuron comprises a respective (unitary) encoding function $U_i$ configured to encode the data into the respective register qubits;

— The respective (unitary) weight function $U_w$ is configured to act on the coefficients of the respective register qubits; and

— Each neuron includes one or more respective ancilla qubits.
The connection between successive layers of the quantum neural network is obtained by the invention through quantum synapses.
The "quantum synapses" can be defined as controlled operations between the qubits encoding single artificial neurons of a given layer and the ones representing the neurons of the respective successive layer, in such a way that the input information on the latter is determined by the activation state of the previous one. In case information on the activation status of each layer is not required, the global activation state of the network thus built can be determined by only performing measurements on the qubits encoding neurons and/or their ancillae in the output layer.
In some cases, the classical feedforward of information between successive layers can be directly reformulated in terms of fully quantum coherent synapses according to the principle of deferred measurement [5], which states that a measurement performed at an intermediate stage of a quantum circuit can always be moved to the end of the computation by replacing classically controlled operations with quantum controlled ones. Notice that any generic multi-controlled quantum operation can actually be used to implement the proposed quantum synapses, irrespective of the specific available hardware and even without any classical equivalent.
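The deferred measurement principle can be illustrated with a toy two-qubit simulation (plain Python; the amplitude ordering |00>, |01>, |10>, |11> with the control qubit first is a convention chosen for this sketch):

```python
def cnot(state):
    """Quantum-controlled NOT (control = first qubit): swaps the
    amplitudes of |10> and |11>."""
    c00, c01, c10, c11 = state
    return [c00, c01, c11, c10]

def outcome_probs(state):
    """Born-rule probability of each classical outcome 00, 01, 10, 11."""
    return [abs(c) ** 2 for c in state]

def measure_then_classical_x(state):
    """Measure the control first; if it reads 1, apply a classically
    controlled X to the target. Returns the final outcome distribution."""
    p = outcome_probs(state)
    # control = 0 branch unchanged; control = 1 branch has target flipped
    return [p[0], p[1], p[3], p[2]]

state = [0.6, 0.0, 0.0, 0.8]   # an arbitrary normalized two-qubit state
# Measuring early + classical control == quantum control + measuring late:
assert measure_then_classical_x(state) == outcome_probs(cnot(state))
```

The outcome statistics are identical either way, which is exactly what allows the classical feedforward between layers to be replaced by coherent quantum synapses.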
In general, according to an aspect of the invention, one or more connections of said set of connections are quantum artificial synapses starting from respective elements of said non-void subset of quantum artificial neurons and ending on an end neuron belonging to said set of neurons in a successive layer. Each of the one or more connections:
— Can include at least a respective measurement means for measuring said respective one or more ancilla qubits of the respective elements;
— When said end neuron belongs to said non-void subset of quantum neurons, it can include respective controlling operation means (which are hardware means) configured to apply, upstream of the respective measurement means (which are hardware means), a unitary operation on the one or more respective ancilla qubits of the respective elements of said non-void subset (e.g. starting neurons in one layer) and the register qubits of the end neuron, wherein the unitary operation is configured to write its output as an input of the end neuron.
When the synapse connects a quantum neuron to a classical one, the measurement means already provide the suitable input for the classical neuron. When instead the synapse is from a classical neuron to a quantum neuron, the above unitary encoding function $U_i$ is conveniently used without the application of the above controlling operation means.
These synapses can be realized as follows, referring to the simplified deep neural network schematically represented in Fig. 3, and its quantum circuit representation according to the invention, shown in Fig. 4.
In this exemplary quantum version, we suppose for simplicity that the input is fed in parallel to the two neurons of the hidden layer and that the input is already in the quantum form of an input wave function $|\psi_i\rangle$.
After the neurons of the hidden layer, the two ancillas of the hidden layer carry the information on the two neurons' operations as the coefficient of their $|1\rangle$ state. The synapses work as follows: the measurement on ancillas 1 and 2 collapses their state onto $|0\rangle$ or $|1\rangle$ with probabilities depending on the hidden neurons' results.
At the time of the measurement, the two ancillae are entangled (via the controlled gate operations) with the register (a single qubit in this case) of the output neuron.
These measurements can be hardware dependent. For example, in the case of superconducting quantum computing hardware they consist in measuring currents in superconducting qubits or microwave cavity phase shifts; in prospective photonic quantum processors, they might consist in assessing photon numbers, quadratures of the electromagnetic field, or other properties such as the polarization or frequency of the photons; in current versions of trapped-ion quantum computers, measurements consist in single-shot read-out of absorbed signal photons conditioned on the presence of control photons; and so on.
The measurements on the different ancillae of the various neurons can be simultaneous or not. The measurements can be done at the output of the network. The measurement at the output neurons gives the result, but the measurements at the other neurons provide the network with the necessary nonlinearity, because a measurement involves the modulus of a complex number, and therefore some form of squaring operation. One can introduce further nonlinearities, for example by using the universal approximation theorem mentioned in the following.
The measurement on ancillae 1 and 2 collapses the register of the output neuron onto a state of the form $(|0\rangle \pm |1\rangle)/\sqrt{2}$, and this sets the input for the output neuron. The plus or minus sign depends on whether the measurements on ancillae 1 and 2 yield 0 or 1, respectively. The output neuron uses two ancillae: one is measured, and the output of the measurement can be passed on for classical information processing (for instance, but not limited to, classical layers of a hybrid network); the other remains in quantum form and can be passed on for further quantum information processing. This last ancilla may not be necessary in some cases. If the output neurons have more than one qubit in their registers, each coefficient of their wave functions can be set by synapses in the same way as illustrated here, using multiply controlled operations instead of simple CNOTs.
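The synapse read-out just described can be sketched as follows (plain Python; the single-qubit register state and the sign convention are illustrative assumptions, not the exact circuit of Fig. 4):

```python
import random
from math import sqrt

def synapse_readout(p_ancilla_one, rng=random.random):
    """Sketch of a quantum synapse: the ancilla of a hidden neuron is
    measured and reads 1 with probability p (Born rule); the outcome
    then sets the sign of the |1> component written into the next
    neuron's single-qubit register."""
    outcome = 1 if rng() < p_ancilla_one else 0
    sign = -1.0 if outcome == 1 else 1.0
    # next register prepared in (|0> + sign * |1>) / sqrt(2)
    return outcome, [1.0 / sqrt(2), sign / sqrt(2)]

# Deterministic limits: an ancilla certain to read 1 flips the sign.
print(synapse_readout(1.0))   # outcome 1, register (|0> - |1>)/sqrt(2)
print(synapse_readout(0.0))   # outcome 0, register (|0> + |1>)/sqrt(2)
```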
In classical computation, any memory bit can be turned on or off at will, requiring no prior knowledge or extra gadgetry. However, this is not the case in quantum computing or classical reversible computing. In these models of computing, all operations on computer memory must be reversible, and toggling a bit on or off would lose the information about the initial value of that bit. For this reason, in a quantum algorithm there is no way to deterministically put bits in a specific prescribed state unless one is given access to bits whose original state is known in advance. Such bits, whose values are known a priori, are known as ancilla bits in a quantum or reversible computing task.
A first use for ancilla bits is downgrading complicated quantum gates into simple gates. For example, by placing controls on ancilla bits, a Toffoli gate can be used as a controlled-NOT (CNOT) gate or a NOT gate. For classical reversible computation it is known that a single ancilla bit is necessary and sufficient for universal computation. Additional ancilla bits are not necessary, but the extra workspace can allow for simpler circuit constructions that use fewer gates. In quantum computing, quantum catalysis uses ancilla qubits to store entangled states that enable tasks that would not normally be possible with local operations and classical communication (LOCC) . Quantum computers also use ancilla bits for quantum error correction [2,3] .
According to an aspect of the present description, the ancilla qubit of each neuron of the preceding layer carries the information on the operation of the neuron in the form of its coefficients on a suitable basis. Through controlled operations between the ancilla and the qubits of the neurons in the successive layer, the coefficients of these qubits can be set, thus setting the input wave function for the neurons of the new layer. This can be achieved by directly setting each coefficient of the neuron's qubits on a suitable basis, or through more efficient means such as the hypergraph states method [1], or any other suitable or convenient method.
In case of a mixed network with quantum and classical artificial neurons, the ancilla is not present on the classical neurons, therefore between two classical neurons and from a classical neuron to a quantum neuron there is no ancilla measurement. Instead, from a quantum neuron to a classical one, the ancilla measurement result is the input value of the classical neuron.
Each neuron's ancilla is then collapsed via measurement. This operation can be performed layer by layer, or as a batch at the end of the quantum deep neural network operation. The measurement process provides the nonlinearity necessary for training a neuron [1]. For the present invention, we have demonstrated that quantum neural networks of this kind have the necessary nonlinearity, thanks to the nonlinearity of the neurons above, and that they can be trained by using back-propagation algorithms and gradient descent; however, any other optimization method, for instance the Newton-Raphson method or any other convenient learning procedure, could be applied to this quantum network as well.
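As an illustration of such classical training, here is a minimal gradient descent sketch on a single quantum neuron with continuous weights (plain Python; the patterns, cost function, learning rate and finite-difference gradients are all illustrative choices, not the procedure of the patent's experiment):

```python
from math import sqrt

def activation(w, x):
    """Quantum-neuron output: Born-rule probability, i.e. the squared
    normalized overlap of weight and input vectors (w kept unit-norm)."""
    dot = sum(a * b for a, b in zip(w, x))
    nx = sqrt(sum(v * v for v in x))
    return (dot / nx) ** 2

def cost(w, positives, negatives):
    """Classically defined cost: positive patterns should activate
    (p -> 1), negative patterns should not (p -> 0)."""
    c = sum((activation(w, x) - 1.0) ** 2 for x in positives)
    c += sum(activation(w, x) ** 2 for x in negatives)
    return c

def normalize(w):
    n = sqrt(sum(v * v for v in w))
    return [v / n for v in w]

pos = [[1, -1, 1, 1]]                    # pattern to recognize
neg = [[1, 1, 1, -1], [1, 1, -1, 1]]     # patterns to reject (orthogonal)
w = normalize([0.5, 0.1, -0.3, 0.2])     # arbitrary starting weights
eps, lr = 1e-4, 0.05
for _ in range(2000):                    # finite-difference gradient descent
    base = cost(w, pos, neg)
    grad = []
    for k in range(len(w)):
        wp = list(w)
        wp[k] += eps
        grad.append((cost(wp, pos, neg) - base) / eps)
    w = normalize([wk - lr * g for wk, g in zip(w, grad)])

print(activation(w, pos[0]), activation(w, neg[0]))
```

After training, the activation on the positive pattern approaches 1 while the negatives stay near 0; note that, by the quadratic symmetry discussed earlier, the optimizer may equally well converge to the weights or to their negative.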
The output of the deep quantum neural network can be in classical form, i.e. the result of measurements on the ancilla qubits of the neurons of the last layer, to be used for data processing, or in quantum form, to be passed on to other quantum computation algorithms or transmitted via quantum channels.
According to a general aspect of the invention, a computer-implemented method for training an artificial neural network, wherein the computer includes quantum computing hardware, comprises the following steps:

A. Providing a system of computing hardware configured to implement an artificial neural network as defined above;

B. Providing a set of training data for the artificial neural network;

C. Using the associated training algorithm to train the artificial neural network.
According to an aspect of the invention, the following further step is performed before step A:
A0. Mapping the neurons of the subset of classical artificial neurons into further quantum neurons which are included in said non-void subset of quantum artificial neurons, so that all the neurons of said set of neurons are quantum artificial neurons; wherein said associated training algorithm is a quantum function of all the neurons in said set of neurons.
According to another aspect of the invention, after step C, the following further step is executed:
D. Mapping at least one neuron of said set of neurons into at least a corresponding classical artificial neuron which is included in said subset of classical artificial neurons.
This back-and-forth mapping (e.g. as in Fig. 3 (a) to (b), and vice versa, thanks to the fact that the quantum operations are reversible) allows the use of quantum networks for problems that need a speed-up in training, and of a (partially) classical network to apply the trained network.
The great advantage of quantum artificial neural networks with respect to neural networks implemented on classical computing hardware is that the quantum version combines the exponential advantage of quantum computing, in both memory and parallel computation, with the nonlinearities typical of classical computing systems, which act to provide the network with the plasticity necessary for machine learning. Indeed, we expect that networks with layers of quantum neurons each built with more than 30 qubits will be able to provide functionalities impossible even for the most performant GPU systems, and networks with quantum neurons built with more than 50 qubits will provide functionalities impossible even with supercomputers. Indeed, N = 50 qubits are able to store something like 9 × 10^15 bytes of information (i.e., 9 PB, assuming 8 bytes to store a complex number in single precision), which roughly corresponds to the random access memory of state-of-the-art supercomputers. These advantages will come from the ability of our system to hold and process large quantities of data in parallel. Indeed, by encoding information as wave functions as detailed above, the system of qubits and the operations implemented on the qubits act at the same time as both memory and processor, providing the exponential advantages described above for both. On the other hand, the use of measurement on the ancilla qubits of each neuron will provide to quantum computing protocols the nonlinearities that are impossible to achieve in purely quantum algorithms.
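The memory figure quoted above can be checked directly (assuming, as in the text, 8 bytes per complex amplitude):

```python
def classical_bytes(n_qubits, bytes_per_amplitude=8):
    """Classical memory needed to store the full state vector of an
    n-qubit register: 2**n_qubits complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude

print(classical_bytes(30))   # 8589934592 bytes, ~8.6 GB
print(classical_bytes(50))   # 9007199254740992 bytes, ~9 x 10^15 (9 PB)
```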
The capability of the quantum artificial (e.g. deep) neural network to output classical information makes these networks fully compatible with classical neural networks. When this invention is employed with a few layers implemented on a quantum computer as specified in this patent, these layers can be seamlessly integrated into a larger classical neural network. Thanks to their exponential memory advantage with respect to classical layers, the quantum layers can be used for highly memory- and computation-intensive tasks. They can, for instance but not limited to, be used to implement convolutional filters of very large patterns on extremely large input datasets. Since the quantum layers of the network can be trained using existing classical algorithms such as back-propagation, the integration with the classical layers is seamless, and the whole neural network can be trained as a whole using well-known training algorithms. To this extent we notice that, at the interface between a quantum layer and a following classical layer, additional nonlinear functions can be applied to the results of the measurement to improve training.
In another embodiment, these networks can be realized in a fully quantum way, using only quantum neurons as described in [1]. Deep neural networks containing only quantum layers can be trained as before using classical optimization algorithms (for instance back-propagation), but also using purely quantum algorithms, such as quantum search algorithms. This provides an additional advantage with respect to classical neural networks, as quantum search algorithms offer at least polynomial advantages with respect to their classical counterparts. Indeed, in quantum algorithms one can, for instance, input a set of T training cases all at the same time in the form of the uniform superposition $\frac{1}{\sqrt{T}}\sum_{t=1}^{T}|x_t\rangle$, or any other suitable quantum superposition including, if needed, additional register or ancilla qubits.
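One illustrative way to build such a superposition classically, for intuition only (the index-register construction and the helper name `batch_superposition` are assumptions, not the patent's prescribed encoding), is:

```python
import numpy as np

# Illustrative sketch: a batch of T training vectors presented "all at the
# same time" as a single normalized state over an index register |t> and a
# data register |x_t>.

def batch_superposition(training_vectors):
    """Return the amplitude vector of (1/sqrt(T)) * sum_t |t> (x) |x_t>,
    where each |x_t> is the normalized amplitude encoding of one input."""
    xs = [np.asarray(x, dtype=complex) for x in training_vectors]
    xs = [x / np.linalg.norm(x) for x in xs]            # each input is a unit vector
    T = len(xs)
    blocks = []
    for t, x in enumerate(xs):
        e_t = np.zeros(T, dtype=complex)
        e_t[t] = 1.0                                    # index state |t>
        blocks.append(np.kron(e_t, x))
    return sum(blocks) / np.sqrt(T)                     # uniform superposition

psi = batch_superposition([[1, 1, 1, 1], [1, -1, 1, -1]])
print(np.linalg.norm(psi))  # ~1.0: a valid normalized quantum state
```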
In another embodiment of the instant invention, the quantum artificial neural network can be used to implement arbitrary functions for quantum information processing. A well-known result for artificial neural networks is the Universal Approximation Theorem. This theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate arbitrary continuous functions of the input. As a consequence, a quantum deep neural network with a single hidden layer built using the methods described here can be used to provide, as an output, qubits in a quantum state whose coefficients over a suitable basis are related to the coefficients of the input state via an arbitrary continuous function implemented at will. This will be extremely useful in quantum information processing algorithms, and quantum computing algorithms in particular, to provide nonlinear functionalities that would otherwise be difficult to obtain. One can, for example, use this system to obtain functions of the input such as rectifier functions, for instance a Heaviside step function, which can be used as the output nonlinear function of a quantum neuron or be implemented in other quantum algorithms such as search algorithms. The usefulness is, however, not limited to machine learning, as the general ability to represent functions of the input can be of use for quantum simulation and quantum computation protocols in general.
With respect to prior art [8] by Kwok et al., which needs a cumbersome linearization of classical operations to mimic the operation of classical networks, the instant invention employs rotations of the whole wavefunction in the full Hilbert space to process the information in a non-classical way, and it uses a small number of ancillae to extract the result of such processing.
In the instant invention, the networks cannot be trained by simply mimicking the training of a classical neural network; instead, the optimal parameters for such collective rotations are to be found, rather than the values of independent qubits or of rotations of independent qubits. The training of the present networks is thus fundamentally different from what is done in classical machine learning and in [8].
In the present method, information cannot simply be passed from one layer to the next by "passing" qubits. There is instead a need to set the values of the full Hilbert space of the subsequent layer. This is done via the original proposal to implement "quantum synapses", as above detailed.
Experiments
We have performed simulated experiments on the elementary deep neural network outlined in Fig. 3.
Here we refer to the numerical experiment performed to show the effectiveness of the quantum synapse invention. Fig. 3(b) refers to the specific simulation shown, where two artificial neurons (n1 and n2) are able to encode 4 = 2^2 dimensional input vectors (and are thus constituted by 2+1 = 3 qubits each, including the ancillae), and the output layer is made of a single neuron, in this case simply encoded into a single qubit whose activation represents the output of the whole network. In this particular case, the quantum synapses are realized by controlled-Z operations (blue and red). Notice that in Fig. 3(b) these are equivalently decomposed into a proper combination of CNOT and H operations. The nodes n1 and n2 are encoded in parallel. After the operations of the first layer (except the measurement on the ancillae) have been performed, we can write the global state of the total (3+3+1)-qubit network as

$$|\Psi\rangle = \bigotimes_{j=1,2}\left( c_j\,|\tilde{\psi}_j\rangle\,|0\rangle_{a_j} + d_j\,|11\rangle\,|1\rangle_{a_j} \right)\otimes|0\rangle_{n_3}$$

wherein $c_j$ and $d_j$ are the coefficients of the wavefunctions of the first and second neurons respectively after their operations, with $|d_j|^2$ the activation probability of neuron $n_j$ and $|c_j|^2$ the non-activation probability for the same neuron, and $|\tilde{\psi}_j\rangle$ contains, for each neuron, all the components other than the one leading to activation, see Eq. (7) above. Moreover, $|0\rangle_{a_j}$ and $|1\rangle_{a_j}$ are the zero-state and 1-state of ancilla $a_j$, respectively, and $|0\rangle_{n_3}$ is the zero-state of the qubit encoding the third (output) neuron. In the meantime, the $n_3$ qubit is brought into the superposition state $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ by applying a single-qubit Hadamard (H) gate. Synapses can thereafter be implemented with two CZ gates. The overall state of the QDNN finally becomes:

$$|\Psi'\rangle = c_1 c_2\,|\tilde{\psi}_1\rangle|\tilde{\psi}_2\rangle|00\rangle_{a}|+\rangle + c_1 d_2\,|\tilde{\psi}_1\rangle|11\rangle|01\rangle_{a}|-\rangle + d_1 c_2\,|11\rangle|\tilde{\psi}_2\rangle|10\rangle_{a}|-\rangle + d_1 d_2\,|11\rangle|11\rangle|11\rangle_{a}|+\rangle$$

where $|-\rangle$ and $|+\rangle$ represent the activation and rest states of $n_3$, respectively. By defining $|1\rangle_{n_3}$ as H applied to $|-\rangle$ (and $|0\rangle_{n_3}$ as H applied to $|+\rangle$), i.e. by applying a final Hadamard gate on the output qubit, we obtain an output state

$$|\Psi_{\mathrm{out}}\rangle = c_1 c_2\,|\tilde{\psi}_1\rangle|\tilde{\psi}_2\rangle|00\rangle_{a}|0\rangle_{n_3} + c_1 d_2\,|\tilde{\psi}_1\rangle|11\rangle|01\rangle_{a}|1\rangle_{n_3} + d_1 c_2\,|11\rangle|\tilde{\psi}_2\rangle|10\rangle_{a}|1\rangle_{n_3} + d_1 d_2\,|11\rangle|11\rangle|11\rangle_{a}|0\rangle_{n_3}$$
We may observe at this point that the neurons of the hidden layer could also be measured in an activation state, with probabilities:

$$p_1 = |d_1|^2, \qquad p_2 = |d_2|^2$$
However, as long as we are interested only in the output state of the network, we can in fact neglect the information contained in the variables pertaining to the hidden layer. This operation is per se what is commonly known as "tracing out". Mathematically, this is equivalent to a density matrix for the output neuron of the form

$$\rho_{n_3} = \mathrm{Tr}_{\mathrm{hidden}}\left(|\Psi_{\mathrm{out}}\rangle\langle\Psi_{\mathrm{out}}|\right) = (1 - p_{\mathrm{out}})\,|0\rangle\langle 0| + p_{\mathrm{out}}\,|1\rangle\langle 1|$$

which automatically represents the convolution of the hidden nodes, whereby

$$p_{\mathrm{out}} = p_1(1 - p_2) + p_2(1 - p_1)$$
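This XOR-like convolution of the hidden activation probabilities, p1(1-p2) + p2(1-p1), can be cross-checked with a minimal state-vector simulation of the ancilla/output subcircuit. This is an illustrative sketch under stated assumptions (the register qubits are omitted, since the CZ synapses act only on the ancillae and the output statistics are unchanged; `cz` and `output_activation` are hypothetical helper names):

```python
import numpy as np

# Three qubits: ancilla a1, ancilla a2, output qubit, with
# |a_j> = c_j|0> + d_j|1> and p_j = |d_j|^2 the hidden activation probability.

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

def cz(control, target, n=3):
    """Diagonal CZ gate between two qubits of an n-qubit register."""
    U = np.eye(2 ** n, dtype=complex)
    for idx in range(2 ** n):
        bits = [(idx >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control] == 1 and bits[target] == 1:
            U[idx, idx] = -1
    return U

def output_activation(p1, p2):
    a1 = np.array([np.sqrt(1 - p1), np.sqrt(p1)])   # ancilla of neuron n1
    a2 = np.array([np.sqrt(1 - p2), np.sqrt(p2)])   # ancilla of neuron n2
    out = np.array([1.0, 0.0])                      # output qubit in |0>
    psi = np.kron(np.kron(a1, a2), out)
    psi = np.kron(np.kron(I2, I2), H) @ psi         # H brings output to |+>
    psi = cz(0, 2) @ cz(1, 2) @ psi                 # the two CZ synapses
    psi = np.kron(np.kron(I2, I2), H) @ psi         # final H on the output
    probs = np.abs(psi) ** 2
    return probs[1::2].sum()                        # P(output qubit = 1)

p1, p2 = 0.8, 0.3
print(output_activation(p1, p2))   # equals p1*(1-p2) + p2*(1-p1)
```

Note the XOR behavior: both hidden neurons firing (p1 = p2 = 1) leaves the output at rest, exactly as in the output state derived above.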
From this, it can be seen that the output neuron implicitly carries the result of the operations of the hidden layer, thanks to the fact that, before the tracing operation, the output ancilla is entangled with the register qubits of the hidden layer neurons. Thus, the result of the operation is output much more efficiently than in Kwok et al. [8], where information is carried by independent qubits.
In the experiment, the weights in the network have been chosen to distinguish whether the input figure contains lines, either vertical or horizontal. This is an impossible task for a single perceptron, since vertical and horizontal lines are orthogonal. One possible correct choice of weights is w1 = 5 and w2 = 3 on the provided legend, while two pixels of opposite value are needed for w3. With reference to Fig. 6, the chosen inputs are all the possible black and white 2x2 pixel figures (the numbers on the x axis indicate figures as outlined in the legend). The graph of Fig. 7 shows the measured output, which is larger than 0.5 (i.e. positive) only for the input figures containing lines. A simulated experiment realizing the training of the same quantum neural network has been performed using a backpropagation algorithm. The classical backpropagation equations were adapted to our quantum network. First, we imposed the weights to be normalized, as required for quantum mechanical wavefunctions, and constrained the backpropagation algorithm to span only the space of normalized weights. Then we derived the backpropagation equations taking into account only the quadratic nonlinearity that comes from the measurement process in quantum neurons, without the use of threshold or rectifying functions.
A series of positive and negative training cases was fed to the algorithm, weighting a classically defined cost function. The figures in Fig. 8 clearly show that after a few thousand cases the algorithm manages to minimize the cost function, finding the correct weights w1, w2 and w3.
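The adapted training procedure can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the patent's actual code: gradients are taken numerically rather than analytically, a mean-squared cost is used, the hidden activations use the quadratic measurement nonlinearity p = (w·x)^2 with normalized weights, and the output combines them via the CZ-synapse rule p1(1-p2) + p2(1-p1); the names `forward` and `cost` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(w):
    return w / np.linalg.norm(w)

def forward(ws, x):
    w1, w2 = ws
    p1 = np.dot(normalize(w1), x) ** 2       # quadratic "measurement" nonlinearity
    p2 = np.dot(normalize(w2), x) ** 2
    return p1 * (1 - p2) + p2 * (1 - p1)     # CZ-synapse combination (XOR-like)

# All 16 two-by-two black/white images as normalized +-1 vectors;
# label 1 if the image is a horizontal or vertical line.
X = np.array([[1 if (i >> k) & 1 else -1 for k in range(4)]
              for i in range(16)]) / 2.0
lines = {0b0011, 0b1100, 0b0101, 0b1010}
y = np.array([1.0 if i in lines else 0.0 for i in range(16)])

def cost(ws):
    return np.mean([(forward(ws, x) - t) ** 2 for x, t in zip(X, y)])

ws = [normalize(rng.normal(size=4)), normalize(rng.normal(size=4))]
c0 = cost(ws)
eps, lr = 1e-5, 0.1
for _ in range(1000):
    grads = []
    for j in range(2):
        g = np.zeros(4)
        base = cost(ws)
        for k in range(4):                   # numerical gradient on the sphere
            wp = [w.copy() for w in ws]
            wp[j][k] += eps
            g[k] = (cost(wp) - base) / eps
        grads.append(g)
    ws = [normalize(w - lr * g) for w, g in zip(ws, grads)]

print(c0, "->", cost(ws))                    # the cost decreases during training
```

A correct solution exists in this weight space: one hidden neuron aligned with the horizontal-line pattern and one with the vertical-line pattern yields p_out = 1 exactly on the line figures, which is what gradient descent converges towards.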
Application fields and further advantages
The present invention will be of use in many different applications. Quantum technologies are emerging as an important route to improve information and cognitive production tasks beyond the goals of "Industry 4.0". The ability to run artificial intelligence algorithms whose capabilities surpass existing machine learning programs, thanks to quantum computing, will benefit all productive activities using artificial intelligence. Industries using AI to process large quantities of data will be heavily impacted by our innovation, due to the intrinsic capability of our product to scale exponentially better than standard neural networks with the dimension of the data to be processed. These industries include, but are not limited to, those operating in the field of smart cities and smart mobility, both heavily reliant on the efficient analysis of large quantities of data coming from grids of sensors. Healthcare and biomedical firms are also increasingly exploiting AI for information-heavy tasks like drug discovery and DNA sequencing, and will be able to do so much more efficiently via the use of our quantum neural networks. The same can be said for financial services, where quantum AI can be efficiently applied to risk pricing and management. Finally, it is important to notice that several competing technologies are being developed to reach the goal of universal quantum computing, including the superconductor-based approach followed by Google, Intel and IBM, photonics, and trapped ions. Our innovations are, however, platform-independent, allowing our circuits to be implemented on any quantum computing machine.
Bibliography
[1] F. Tacchino, C. Macchiavello, D. Gerace, D. Bajoni, "An Artificial Neuron Implemented on an Actual Quantum Processor", npj Quantum Information, vol. 5, no. 26 (2019)
[2] K. Azuma, M. Koashi, N. Imoto, "Quantum catalysis of information", arXiv:0804.2426 [quant-ph] (2008)
[3] P. W. Shor, "Scheme for reducing decoherence in quantum computer memory", Physical Review A, vol. 52, no. 4, pp. R2493-R2496 (1 October 1995), DOI: 10.1103/PhysRevA.52.R2493
[4] S. Lloyd, M. Mohseni, P. Rebentrost, "Quantum Algorithms for Supervised and Unsupervised Machine Learning", arXiv:1307.0411 [quant-ph] (1 July 2013), http://arxiv.org/abs/1307.0411
[5] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge; New York (2000)
[6] M. Schuld, F. Petruccione, Supervised Learning with Quantum Computers, Springer International Publishing (2018), ISBN: 978-3-319-96423-2
[7] M. Schuld, M. Fingerhuth, F. Petruccione, "Implementing a Distance-Based Classifier with a Quantum Interference Circuit", EPL (Europhysics Letters), vol. 119, no. 6, p. 60002 (1 September 2017), https://doi.org/10.1209/0295-5075/119/60002
[8] Kwok Ho Wan et al., "Quantum generalisation of feedforward neural networks", npj Quantum Information, vol. 3, no. 1 (14 September 2017), DOI: 10.1038/s41534-017-0032-4
In the foregoing, the preferred embodiments have been described and variations of the present invention have been suggested, but it is to be understood that those skilled in the art will be able to make modifications and changes without thereby departing from the relevant scope of protection, as defined by the attached claims.

Claims

1) System of computing hardware configured to implement an artificial neural network, wherein the artificial neural network comprises a plurality of layers including:
— An input layer comprising one or more respective m-dimensional input data vectors, with m a respective positive integer;
— A set of neurons:
comprising one or more hidden layers constituted by a plurality of hidden neurons;
comprising an output layer constituted by a plurality of output neurons providing output data vectors; and
each comprising a respective neuron register of data including a set of m coefficients;
— A set of connections between the neurons;
— A respective unitary weight function configured to act on the coefficients of the neuron register; and
— An associated training algorithm configured to change the respective unitary weight function depending on the output data vectors;
The system being characterized in that said set of neurons comprises:
— A non-void subset of quantum artificial neurons implemented on quantum computing hardware; and
— A subset of classical artificial neurons implemented on classical computing hardware;
Wherein in the non-void subset of quantum artificial neurons:
— The respective neuron register of data is a respective set of N neuron register qubits representing said set of m coefficients as coefficients of a superposition of m basis quantum states, wherein N is derived from the relationship m = 2^N;
— If a neuron receives input from a classical artificial neuron or from data of the input data vector, the neuron comprises a respective unitary encoding function configured to encode the data into the respective register qubits;
— The respective unitary weight function is configured to act on the coefficients of the respective register qubits; and
— Each neuron includes one or more respective ancilla qubits;
Wherein one or more connections of said set of connections are quantum artificial synapses starting from respective elements of said non-void subset of quantum artificial neurons and ending on an end neuron belonging to said set of neurons in a successive layer, each of the one or more connections comprising:
— At least a respective measurement means for measuring said respective one or more ancilla qubits of the respective elements;
— If said end neuron belongs to said non-void subset of quantum artificial neurons, respective controlling operations means configured to apply, upstream of the respective measurement means, a unitary operation on the one or more respective ancilla qubits of the respective elements of said non-void subset and the register qubits of the end neuron, wherein the unitary operation is configured to write its output as an input of the end neuron.
2) System according to claim 1, wherein the one or more m-dimensional input data vectors include classical data and/or quantum data.
3) System according to one or more claims 1-2, wherein the one or more m-dimensional input data vectors consist of a single input data vector, fed to each neuron of the first hidden layer in parallel or fed in split portions to different neurons of the first hidden layer.
4) System according to one or more claims 1-3, wherein said m basis quantum states are the same for all quantum artificial neurons.
5) System according to one or more claims 1-4, wherein said subset of classical artificial neurons is void.
6) System according to one or more claims 1-4, wherein the subset of quantum artificial neurons constitutes one or more complete layers in the plurality of layers, the remaining layers being layers of classical neurons.
7) System according to claim 6, wherein at the interface between a quantum layer and a following classical layer, nonlinear functions are implemented downstream of the respective measurement means.
8) Computer-implemented method for training an artificial neural network, wherein the computer includes quantum computing hardware, wherein the following steps are performed:
A. Providing a system of computing hardware configured to implement an artificial neural network as defined in one or more claims 1 to 7;
B. Providing a set of training data for the artificial neural network;
C. Using said associated training algorithm to train the artificial neural network.
9) Computer-implemented method according to claim 8, wherein said associated training algorithm is a classical function of all the neurons in said set of neurons.
10) Computer-implemented method according to claim 8, wherein the following further step is performed before step A:
A0. Mapping the neurons of the subset of classical artificial neurons into further quantum neurons which are included into said non-void subset of quantum artificial neurons, so that all the neurons of said set of neurons are quantum artificial neurons; Wherein said associated training algorithm is a quantum function of all the neurons in said set of neurons.
11) Computer-implemented method according to claim 10, wherein, after step C, the following further step is executed:
D. Mapping at least one neuron of said set of neurons into corresponding at least a classical artificial neuron which is included into said subset of classical artificial neurons.
PCT/EP2020/064764 2019-06-04 2020-05-27 Artificial neural network on quantum computing hardware WO2020245013A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT102019000007947 2019-06-04
IT201900007947 2019-06-04

Publications (1)

Publication Number Publication Date
WO2020245013A1 true WO2020245013A1 (en) 2020-12-10
