CN116089674A - Quantum computation-based data clustering method, device, equipment and storage medium - Google Patents

Info

Publication number
CN116089674A
Authority
CN
China
Prior art keywords
data
preset
quantum
data points
data point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111268632.2A
Other languages
Chinese (zh)
Inventor
窦猛汉
王伟
李蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Origin Quantum Computing Technology Co Ltd
Original Assignee
Origin Quantum Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Origin Quantum Computing Technology Co Ltd filed Critical Origin Quantum Computing Technology Co Ltd
Priority to CN202111268632.2A priority Critical patent/CN116089674A/en
Publication of CN116089674A publication Critical patent/CN116089674A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N10/00 Quantum computing, i.e. information processing based on quantum-mechanical phenomena
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a data clustering method, device, equipment and storage medium based on quantum computing. The method comprises: acquiring a first data point in a first set, the first set comprising a plurality of data points; judging, according to a preset quantum circuit, whether the number of other data points in a preset neighborhood of the first data point is greater than a preset threshold value; and, if so, marking the first data point as a core point and marking the other data points in the preset neighborhood of the first data point into a first cluster with the first data point as its core. Data with higher similarity are thereby divided into the same cluster based on quantum computation and a clustering algorithm.

Description

Quantum computation-based data clustering method, device, equipment and storage medium
Technical Field
The invention belongs to the field of quantum computing, and particularly relates to a data clustering method, device, equipment and storage medium based on quantum computing.
Background
The two cores of quantum computing are physical quantum computers and quantum algorithms. Limited by the development of quantum computer hardware, many quantum algorithms remain at the theoretical derivation stage. Classical simulation of quantum algorithms is likewise limited to quantum circuits with few qubits. Taking the full-amplitude algorithm as an example, when the number of qubits in the quantum circuit reaches 32, classical simulation needs at least 64 GB of memory, and simulating a quantum circuit of 33 qubits requires 128 GB. This exponentially increasing memory requirement severely limits classical simulation of quantum algorithms.
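The memory figures above can be checked with a short calculation. This is a minimal sketch that assumes each complex amplitude occupies 16 bytes (two 64-bit floats); the function name `full_amplitude_memory_bytes` and that byte size are illustrative assumptions, not values stated in the application.

```python
def full_amplitude_memory_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store all 2**n complex amplitudes of an n-qubit state.

    bytes_per_amplitude = 16 is an assumption (double-precision complex).
    """
    return (2 ** num_qubits) * bytes_per_amplitude


GIB = 2 ** 30  # bytes per gibibyte

for n in (30, 31, 32, 33):
    print(n, "qubits:", full_amplitude_memory_bytes(n) // GIB, "GiB")
```

Each added qubit doubles the number of amplitudes, which is why the requirement doubles from 32 to 33 qubits.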
To update the core object of the current cluster using a quantum algorithm, find the data belonging to the same class through distance values, and so complete the clustering of the data, a corresponding quantum algorithm is needed to realize the clustering process and fill this technical gap.
Disclosure of Invention
The data clustering method, device, equipment and storage medium based on quantum computation provided by the application exert the parallel advantage of quantum computation, and divide data with higher similarity into the same cluster based on quantum computation and a clustering algorithm.
In a first aspect, the present application provides a data clustering method based on quantum computation, including:
obtaining a first data point in a first set, the first set comprising a plurality of data points;
judging whether the number of other data points in a preset neighborhood of the first data point is larger than a preset threshold value or not according to a preset quantum circuit;
if so, marking the first data point as a core point, and marking the other data points in the preset neighborhood of the first data point into a first cluster with the first data point as its core.
Optionally, if the number of other data points in the preset neighborhood of the first data point is not greater than a preset threshold, the method further includes:
the first data point is marked as a non-core point, which is not the core of any cluster.
Optionally, after the marking other data points in the preset neighborhood of the first data point into the first cluster with the first data point as a core, the method further includes:
acquiring a second data point in the first set, the second data point being located at a different location than the first data point;
judging whether the number of other data points in the preset neighborhood of the second data point is larger than a preset threshold value or not;
if so, setting the second data point as a core point, and dividing the other data points in the preset neighborhood of the second data point into a second cluster with the second data point as its core;
judging whether any data point in the second cluster is already divided into a first cluster taking the first data point as a core;
if yes, the first data point is taken as a core, and the first cluster and the second cluster are combined.
Optionally, the determining whether the number of other data points in the preset neighborhood of the first data point is greater than a preset threshold value includes:
calculating the similarity between the first data point and other data points in the first set according to a preset quantum circuit;
judging whether the number of first similarities greater than or equal to a preset similarity is greater than a preset threshold value;
if yes, the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value;
if not, the number of other data points in the preset neighborhood of the first data point is not greater than a preset threshold value.
Optionally, the calculating the similarity between the first data point and other data points in the first set one by one according to a preset quantum circuit includes:
constructing the preset quantum circuit according to preset quantum logic gates, wherein the preset quantum logic gates comprise an RX gate, an RY gate, an H gate and a controlled SWAP gate;
preparing data points in the first set into quantum states respectively;
preparing the quantum state of the first data point and the quantum states of other data points in the first set onto the quantum circuit, and operating the quantum circuit;
and measuring a target quantum bit of the quantum circuit, and obtaining the similarity between the first data point and other data points in the first set according to the measurement result of the target quantum bit.
Optionally, the determining whether the number of first similarities greater than or equal to the preset similarity is greater than a preset threshold value includes:
mapping a first similarity greater than or equal to the preset similarity to a first target value;
finding, according to a preset quantum search algorithm, the number of first similarities corresponding to the first target value;
if the number of the first similarities corresponding to the first target value is larger than a preset threshold value, the number of the first similarities larger than or equal to the preset similarity is larger than the preset threshold value;
if the number of the first similarities corresponding to the first target value is not greater than the preset threshold, the number of the first similarities greater than or equal to the preset similarity is not greater than the preset threshold.
In a second aspect, the present application provides a data clustering device based on quantum computation, including:
an acquisition unit configured to acquire a first data point in a first set, the first set including a plurality of data points;
the judging unit is used for judging whether the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value or not;
and the marking unit is used for marking the first data point as a core point and marking other data points in the preset neighborhood of the first data point into a first cluster taking the first data point as a core if the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing steps in the method described in the first aspect of the embodiment of the present application.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program causes a computer to perform some or all of the steps described in the method according to the first aspect of the embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, wherein the computer program product comprises a non-transitory computer readable storage medium storing a computer program, the computer program being operable to cause a computer to perform some or all of the steps described in the method according to the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
In a sixth aspect, embodiments of the present application provide a quantum computer operating system, where the quantum computer operating system implements a process based on quantum computing data clustering according to some or all of the steps described in the method according to the first aspect of the embodiments of the present application.
It can be seen that a first data point in a first set is obtained, the first set includes a plurality of data points, according to a preset quantum circuit, whether the number of other data points in a preset neighborhood of the first data point is larger than a preset threshold value is judged, if so, the first data point is marked as a core point, and other data points in the preset neighborhood of the first data point are marked into a first cluster taking the first data point as a core. By adopting the embodiment of the application, the data with higher similarity can be divided into the same cluster based on quantum computation and a clustering algorithm.
Drawings
FIG. 1 is a schematic diagram of a flow of a data clustering method based on quantum computation according to an embodiment of the present application;
FIG. 2 is another schematic diagram of a flow of a data clustering method based on quantum computing according to an embodiment of the present application;
FIG. 3 is another schematic diagram of a flow of a data clustering method based on quantum computing according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a quantum circuit for preparing a quantum state according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a quantum circuit for calculating similarity according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a modularized quantum circuit corresponding to a Grover algorithm provided in the present application;
FIG. 7 is an iterative schematic of a Grover algorithm provided herein;
FIG. 8 is a schematic structural diagram of a data clustering device based on quantum computation according to an embodiment of the present application;
FIG. 9 is a hardware structure block diagram of a computer terminal of a data clustering method based on quantum computation according to an embodiment of the present application.
Detailed Description
The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
The data clustering method, device, equipment and storage medium based on quantum computation provided by the application exert the parallel advantage of quantum computation, and divide data with higher similarity into the same cluster based on quantum computation and a clustering algorithm.
It should be noted that, the quantum program referred to in the embodiments of the present application is a program written in a classical language to characterize qubits and their evolution, where qubits, quantum logic gates, and the like related to quantum computing are all represented by corresponding classical codes.
A quantum circuit, also called a quantum logic circuit, is one embodiment of a quantum program and is the most commonly used general quantum computing model. It represents a circuit that operates on qubits under an abstract concept. Its components include qubits, lines (timelines), and various quantum logic gates, and the result usually has to be read out at the end by a quantum measurement operation. A quantum circuit can be presented as a sequence of quantum logic gates arranged in a certain execution order.
Unlike conventional circuits, which are connected by metal wires that carry voltage or current signals, the lines in a quantum circuit can be regarded as connections through time: the state of a qubit evolves naturally over time, as directed by the Hamiltonian operator, until it encounters a quantum logic gate and is operated on.
A quantum program generally corresponds to one total quantum circuit; the quantum program referred to here means this total quantum circuit, and the total number of qubits in the total quantum circuit equals the total number of qubits in the quantum program. It can be understood that a quantum program may consist of quantum circuits, measurement operations on the qubits in those circuits, registers that hold the measurement results, and control-flow nodes (jump instructions), and that one quantum circuit may contain tens, hundreds, or even thousands of quantum logic gate operations. Executing a quantum program means executing all of its quantum logic gates in a certain time sequence. Note that the timing here is the order in time in which each single quantum logic gate is executed.
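As a rough illustration of how classical code can represent a quantum program as an ordered gate sequence plus measurements and a result register, the sketch below uses hypothetical names (`GateOp`, `QuantumProgram`). It is not the representation used by any particular quantum framework.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GateOp:
    name: str                        # gate label, e.g. "H", "RX", "CSWAP"
    qubits: Tuple[int, ...]          # indices of the qubits the gate acts on
    params: Tuple[float, ...] = ()   # rotation angles, if the gate has any


@dataclass
class QuantumProgram:
    num_qubits: int
    gates: List[GateOp] = field(default_factory=list)
    # (qubit index, classical register slot) pairs
    measurements: List[Tuple[int, int]] = field(default_factory=list)

    def add_gate(self, name, qubits, params=()):
        """Append a gate; list order is the execution order."""
        self.gates.append(GateOp(name, tuple(qubits), tuple(params)))
        return self

    def measure(self, qubit, slot):
        """Record that `qubit` is measured into register slot `slot`."""
        self.measurements.append((qubit, slot))
        return self


prog = QuantumProgram(num_qubits=3)
prog.add_gate("H", [0]).add_gate("CSWAP", [0, 1, 2]).add_gate("H", [0]).measure(0, 0)

for step, op in enumerate(prog.gates):
    print(step, op.name, op.qubits)
```

Storing the gates in a list makes the execution timing explicit: position in the list is the time order.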
It should be noted that in classical computation the most basic unit is the bit and the most basic control mode is the logic gate; a circuit can be controlled as intended by combining logic gates. Similarly, qubits are manipulated with quantum logic gates. Quantum logic gates are the basis of quantum circuits. They include single-bit quantum logic gates (or single-quantum logic gates, abbreviated as "single gates"), such as Hadamard gates (H gates), Pauli-X gates (X gates), Pauli-Y gates (Y gates), Pauli-Z gates (Z gates), RX gates, RY gates, RZ gates, and the like; two-bit quantum logic gates (or double quantum logic gates, simply "double gates"), such as CNOT gates, CR gates, SWAP gates, ISWAP gates, and the like; and multi-bit quantum logic gates (or multi-quantum logic gates, simply "multi-gates"), such as Toffoli gates, and the like. Quantum logic gates are typically represented by unitary matrices, and a unitary matrix is not only a matrix but also an operation and a transformation. The effect of a quantum logic gate on a quantum state is generally calculated by left-multiplying the column vector corresponding to the ket of the quantum state by the unitary matrix. For example, the vector corresponding to the ket |0> is $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, and the vector corresponding to the ket |1> is $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
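The matrix-times-vector rule can be illustrated with plain Python lists. This is a minimal sketch using the standard X and Hadamard matrices; the helper name `apply` is hypothetical.

```python
import math

KET0 = [1.0, 0.0]  # column vector for |0>
KET1 = [0.0, 1.0]  # column vector for |1>

R = 1 / math.sqrt(2)
H = [[R, R],
     [R, -R]]       # Hadamard gate
X = [[0.0, 1.0],
     [1.0, 0.0]]    # Pauli-X gate


def apply(gate, ket):
    """Left-multiply a length-2 state vector by a 2x2 gate matrix."""
    return [sum(gate[r][c] * ket[c] for c in range(2)) for r in range(2)]


print(apply(X, KET0))  # X maps |0> to |1>
print(apply(H, KET0))  # H maps |0> to an equal superposition of |0> and |1>
```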
A quantum state is the logical state of a qubit. In a quantum algorithm (or quantum program), the quantum state of a group of qubits contained in a quantum circuit is written in binary. For example, take a group of qubits q0, q1, q2, denoting the 0th, 1st and 2nd qubits and ordered q2q1q0 from high to low in the binary representation. The number of eigenstates (definite states) of this group is 2 to the power of the number of qubits, that is, 8: |000>, |001>, |010>, |011>, |100>, |101>, |110>, |111>. The bits of each quantum state correspond to the qubits; for example, in the state |001>, the digits 001 correspond to q2q1q0 from high to low. |> is the Dirac notation. For a quantum circuit containing N qubits $q_0, q_1, \ldots, q_n, \ldots, q_{N-1}$, the order in the binary representation of the quantum state is $q_{N-1} q_{N-2} \cdots q_1 q_0$.
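The bit-ordering convention can be made concrete as follows. This is a small sketch with hypothetical helper names (`basis_label`, `qubit_value`); it only restates the q2q1q0 ordering described above.

```python
def basis_label(index: int, num_qubits: int) -> str:
    """Binary label of a basis state, highest qubit first (q_{N-1} ... q_0)."""
    return format(index, "0{}b".format(num_qubits))


def qubit_value(index: int, qubit: int) -> int:
    """Value (0 or 1) of qubit q_k in the basis state with this index."""
    return (index >> qubit) & 1


labels = [basis_label(i, 3) for i in range(2 ** 3)]
print(labels)

# In |001>, q0 is the rightmost digit and q2 the leftmost.
print(qubit_value(1, 0), qubit_value(1, 2))
```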
Taking a single qubit as an example, the logic state ψ of a single qubit may be the |0> state, the |1> state, or a superposition (indefinite state) of the |0> and |1> states, which can be written as ψ = a|0> + b|1>, where a and b are complex numbers representing the amplitudes (probability amplitudes) of the quantum state. The squared modulus of an amplitude is a probability: $|a|^2$ and $|b|^2$ are the probabilities that the logic state is |0> and |1> respectively, with $|a|^2 + |b|^2 = 1$. In short, a quantum state is a superposition of eigenstates; when the probabilities of all other eigenstates are 0, the state is a uniquely determined eigenstate.
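A short check of the normalization rule, as a sketch. The function name `measurement_probabilities` and the example amplitudes are illustrative assumptions.

```python
def measurement_probabilities(a: complex, b: complex):
    """Return (P(|0>), P(|1>)) for the state a|0> + b|1>.

    Raises ValueError if |a|^2 + |b|^2 is not 1 (within a small tolerance).
    """
    p0, p1 = abs(a) ** 2, abs(b) ** 2
    if abs(p0 + p1 - 1.0) > 1e-9:
        raise ValueError("state is not normalized")
    return p0, p1


# Hypothetical amplitudes: a = 0.6, b = 0.8i
p0, p1 = measurement_probabilities(complex(0.6, 0.0), complex(0.0, 0.8))
print(p0, p1)
```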
A clustering algorithm is a machine learning technique that groups data points: given a set of data points, a clustering algorithm can assign each data point to a particular cluster. In some cases, data points in the same group should have similar attributes and/or features, while data points in different groups should have highly different attributes and/or features. Further, when clustering a group of data points, points that are more densely distributed may also be grouped into the same cluster.
Referring to fig. 1, a schematic flow diagram of a data clustering method based on quantum computation provided in an embodiment of the present application includes:
101. obtaining a first data point in a first set, the first set comprising a plurality of data points;
In this embodiment, the first set includes a plurality of data points, each corresponding to a coordinate position. The present application aims to divide data points that are more densely distributed into the same region, that is, the same cluster.
102. Judging whether the number of other data points in a preset neighborhood of the first data point is larger than a preset threshold value or not according to a preset quantum circuit;
103. if so, marking the first data point as a core point, and marking the other data points in the preset neighborhood of the first data point into a first cluster with the first data point as its core.
In this embodiment, the first set includes a plurality of data points representing coordinate positions, and the first set may be regarded as a plane, where the plurality of points are irregularly distributed on the plane, and the points with relatively concentrated distribution are divided into the same cluster.
Specifically, if the preset neighborhood of a given data point, namely the first data point, contains a certain number of data points, the data points in that neighborhood can be considered relatively concentrated. The preset neighborhood, formed with the first data point as its center and a preset distance as its radius, is then taken as a cluster. In implementation, it is judged whether the number of data points in the first set whose distance from the first data point is less than the preset distance is greater than a preset threshold value. If so, the first data point is marked as a core point, and the other data points in its preset neighborhood are marked into a first cluster with the first data point as its core. If not, the first data point is marked as a non-core point, and a non-core point does not serve as the core of any cluster.
In this embodiment, a first data point in a first set is obtained, where the first set includes a plurality of data points, according to a preset quantum circuit, whether the number of other data points in a preset neighborhood of the first data point is greater than a preset threshold value is determined, if so, the first data point is marked as a core point, and other data points in the preset neighborhood of the first data point are marked into a first cluster using the first data point as a core. Therefore, the data with higher similarity are divided into the same cluster based on quantum computation and a clustering algorithm.
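The core-point test of steps 101 to 103 can be sketched classically. In the sketch below a Euclidean distance check stands in for the preset quantum circuit, which is an assumption made purely for illustration; the function names, the sample points, the radius, and the threshold are all hypothetical.

```python
import math


def neighbors(points, i, radius):
    """Indices of points other than i lying within `radius` of points[i]."""
    return [j for j, p in enumerate(points)
            if j != i and math.dist(points[i], p) < radius]


def mark_point(points, i, radius, threshold):
    """Return ("core", neighbor indices) or ("non-core", [])."""
    near = neighbors(points, i, radius)
    if len(near) > threshold:
        return "core", near      # neighbors join the cluster cored at i
    return "non-core", []        # point i is not the core of any cluster


# Hypothetical data: four nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(mark_point(points, 0, radius=0.5, threshold=2))
print(mark_point(points, 4, radius=0.5, threshold=2))
```

In the method described here, the distance comparison would be replaced by the similarity obtained from the quantum circuit.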
Based on fig. 1, the embodiment of the present application further introduces a cluster merging situation, specifically referring to fig. 2, another flow chart of a data clustering method based on quantum computation provided in the embodiment of the present application includes:
201. acquiring a second data point in the first set, the second data point being located at a different location than the first data point;
202. judging whether the number of other data points in the preset neighborhood of the second data point is larger than a preset threshold value or not;
203. if so, setting the second data point as a core point, and dividing the other data points in the preset neighborhood of the second data point into a second cluster with the second data point as its core;
204. judging whether any data point in the second cluster is already divided into a first cluster taking the first data point as a core;
205. if yes, the first data point is taken as a core, and the first cluster and the second cluster are combined.
In this embodiment, the preset neighborhood of a data point contains only a limited number of data points, and data points outside that neighborhood may still be distributed densely enough to belong to the same cluster. Such data points may then be merged with the preset neighborhood to generate a new cluster.
Specifically, this embodiment performs the method of steps 101-103 above once for every data point in the first set, until all points are marked as core points or non-core points. After the first data point is marked as a core point, a second data point is acquired from the first set; the second data point is at a different position from the first data point and has not yet been marked. It is judged whether the number of other data points in the preset neighborhood of the second data point is greater than the preset threshold value. If not, the second data point is marked as a non-core point and the traversal continues with the other unmarked points. If so, the circular region with the second data point as its center and the preset distance as its radius is divided into a second cluster. If the second cluster and the first cluster share data points, that is, some point in the first set lies in the regions of both the first cluster and the second cluster, the second cluster is merged with the first cluster. The core point of the earlier cluster may be used as the new core point; alternatively, a weighted average of the coordinates of all points in the new cluster may be computed to obtain a new coordinate position, and the data point closest to that position used as the new core point. The present application places no restriction on this.
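The merging rule can be sketched as follows. This minimal example takes the first of the two options in the text (keeping the core of the earlier cluster); the dictionary layout and the name `merge_if_overlapping` are hypothetical.

```python
def merge_if_overlapping(first, second):
    """Merge `second` into `first` when the two clusters share a data point.

    Each cluster is a dict {"core": index, "members": set of indices}.
    Returns the merged cluster, or None when the clusters do not overlap.
    """
    if not (first["members"] & second["members"]):
        return None
    return {
        "core": first["core"],  # keep the core of the earlier cluster
        "members": first["members"] | second["members"],
    }


# Hypothetical clusters: point 3 lies in both neighborhoods.
first = {"core": 0, "members": {0, 1, 2, 3}}
second = {"core": 4, "members": {3, 4, 5}}
merged = merge_if_overlapping(first, second)
print(merged["core"], sorted(merged["members"]))
```

Choosing the new core by a weighted average of coordinates, the second option in the text, would only change how the `"core"` entry is computed.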
Based on fig. 1, the process of calculating the similarity and counting the number of data points is further described. Referring specifically to fig. 3, another flow chart of the data clustering method based on quantum computation provided in the embodiment of the present application includes:
301. calculating the similarity between the first data point and other data points in the first set according to a preset quantum circuit;
In this embodiment, the similarity between the coordinate data of different data points is calculated with quantum techniques, and the similarity determines whether the distance between two points satisfies the condition for marking them into the same cluster: if the similarity is greater than a preset minimum value, the distance between the two points is considered smaller than the preset distance.
Specifically, the information of the point is prepared into a quantum state according to the coordinate data of the point, and the process is as follows:
As shown in FIG. 4, two-dimensional data is prepared into quantum states using the quantum logic gates RX and RY. For example, let the coordinates of two data points be $(x_0, y_0)$ and $(x_1, y_1)$.
The two data points can generally be chosen as a data point to be clustered and a cluster center. The rotation-angle parameters of the RX and RY gates are determined as follows:
[Four equation images defining $\theta_{00}$, $\theta_{01}$, $\theta_{10}$ and $\theta_{11}$: not recovered]
where $\theta_{00}$ is the angle representation of $x_0$, $\theta_{01}$ is the angle representation of $y_0$, $\theta_{10}$ is the angle representation of $x_1$, and $\theta_{11}$ is the angle representation of $y_1$.
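The state-preparation step can be sketched with the standard RX and RY matrices. The rotation angles are taken as given inputs, since the formulas that map coordinates to angles are not reproduced here. The gate order (RX followed by RY) and the function name `prepare` are assumptions for illustration only.

```python
import math


def rx(theta):
    """Standard RX(theta) matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def ry(theta):
    """Standard RY(theta) matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def apply(gate, state):
    """Left-multiply a length-2 state vector by a 2x2 gate matrix."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]


def prepare(theta_x, theta_y):
    """Apply RX(theta_x) then RY(theta_y) to |0> (the order is an assumption)."""
    return apply(ry(theta_y), apply(rx(theta_x), [1.0, 0.0]))


# Hypothetical angles standing in for the angle pair of one data point.
state = prepare(math.pi / 3, math.pi / 4)
print(state, sum(abs(a) ** 2 for a in state))
```

Because both gates are unitary, the prepared state stays normalized whatever angles are supplied.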
In this embodiment, quantum states are prepared with the RX and RY gates, and the similarity is then calculated with the controlled SWAP gate. The quantum circuit for the similarity calculation is shown in FIG. 5, another quantum circuit schematic of the data clustering method based on quantum computation of this application. The gate operations of FIG. 5 follow those of FIG. 4. FIG. 5 includes an H gate, the controlled SWAP gate, and a measurement operation M: the H gate places the quantum state prepared in FIG. 4 in a superposition state, the controlled SWAP gate calculates the similarity between q-1 and q-2 and transfers it to q-0, and the measurement operation M measures the quantum state of q-0.
Specifically, the coordinate position of the cluster center is taken as the origin, and the coordinates of the data to be clustered are represented by a vector u pointing from the cluster center to the data to be clustered. To simplify calculation, the vector is unitized according to equation 5. Two entangled states are defined according to equations 6 and 7, the normalization coefficient Z is defined according to equation 8, and the similarity $D_i$ is given by equation 9:
[Equations 5 to 9: equation images not recovered]
where $u = (u_0, u_1, \ldots, u_n)$, m is the total number of data points to be clustered, and the symbol in the unrecovered image denotes the j-th vector of the c-th cluster.
Further, after the controlled SWAP gate operates on |φ> and the other state defined above, the following entangled state is obtained:
[Equation 10: equation image not recovered]
The probability of measuring q-0 in |0> is:
[Equation 11: equation image not recovered]
Then, from equation 11 and equation 9:
$D_i = 2P(|0\rangle) - 1$    (equation 12)
From the derivation of equation 12 above, the similarity can be obtained by measuring the quantum state of the q-0 qubit.
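The relationship in equation 12 can be checked numerically with a small state-vector simulation of the H, controlled-SWAP, H sequence. This sketch assumes the standard swap-test identity P(|0>) = 1/2 + 1/2 |<a|b>|^2, so that 2P(|0>) - 1 equals the squared overlap of the two single-qubit states; whether equation 9 has exactly this form is an assumption. The function name `swap_test_p0` and the sample states are hypothetical.

```python
import math


def swap_test_p0(a, b):
    """Probability of measuring q0 = 0 after H, controlled-SWAP, H.

    a and b are normalized single-qubit states (length-2 lists) placed on
    q1 and q2; q0 is the control qubit and starts in |0>.
    Basis index = q0 + 2*q1 + 4*q2.
    """
    state = [0j] * 8
    for q1 in range(2):
        for q2 in range(2):
            state[2 * q1 + 4 * q2] = a[q1] * b[q2]  # q0 = 0 entries only

    def hadamard_on_q0(s):
        r = 1 / math.sqrt(2)
        out = [0j] * 8
        for base in range(0, 8, 2):  # pairs (q0 = 0, q0 = 1)
            out[base] = r * (s[base] + s[base + 1])
            out[base + 1] = r * (s[base] - s[base + 1])
        return out

    def cswap(s):
        out = list(s)
        # With q0 = 1, exchange q1 and q2: index 3 (q1=1, q2=0) <-> index 5 (q1=0, q2=1).
        out[3], out[5] = s[5], s[3]
        return out

    state = hadamard_on_q0(cswap(hadamard_on_q0(state)))
    return sum(abs(state[i]) ** 2 for i in range(0, 8, 2))  # q0 = 0 entries


# Hypothetical real-valued states on q1 and q2.
a = [math.cos(0.3), math.sin(0.3)]
b = [math.cos(1.0), math.sin(1.0)]
p0 = swap_test_p0(a, b)
overlap_sq = abs(sum(x.conjugate() * y for x, y in zip(a, b))) ** 2
print(2 * p0 - 1, overlap_sq)
```

On hardware, P(|0>) would be estimated from repeated measurements of q-0 rather than computed exactly.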
302. Judging whether the number of the first similarity larger than or equal to the preset similarity is larger than a preset threshold value.
In this embodiment, each first similarity obtained by the method of step 301 is compared with the preset minimum similarity to obtain the number of first similarities greater than or equal to the preset minimum similarity. This number is the number of other data points included in the preset neighborhood of the first data point.
Specifically, suppose the first set contains 8 points other than the first data point, numbered 0 to 7, and the points numbered 4, 5, 6 and 7 satisfy the condition. The largest number is 7, binary 111, so 3 qubits are needed to encode the numbers into quantum states; 3 qubits together represent $2^3 = 8$ values (0 to 7, corresponding to the states |000> to |111>), so 8 probability values are output in total.
Then |000>: output |0>; |001>: output |0>; |010>: output |0>; |011>: output |0>; |100>: output |1>; |101>: output |1>; |110>: output |1>; |111>: output |1>. This gives the probabilities 0, 0, 0, 0, 1, 1, 1, 1 that each quantum state included in the superposition state |φ> satisfies the condition.
The first set corresponds to a first superposition state |φ >:
Figure BDA0003327391450000111
The quantum states |000>, |001>, |010>, |011>, corresponding to indices 0, 1, 2 and 3, have probability 0; the quantum states |100>, |101>, |110>, |111>, corresponding to indices 4, 5, 6 and 7, have probability 1. From the 8 probability values, the index values with probability 1 are found, and from them the element values 4, 5, 6 and 7, that is, the data points included in the preset neighborhood.
Taking the first set as an example, the output consists of the result quantum states |000> to |111> together with the probability that each lies within the preset neighborhood. A probability of 0 indicates that the data point is not within the preset neighborhood, and a probability of 1 indicates that it is. From
Figure BDA0003327391450000114
That is, f(0) = 0, f(1) = 0, f(2) = 0, f(3) = 0, f(4) = 1, f(5) = 1, f(6) = 1, f(7) = 1. Only the indices x with f(x) = 1 need to be found; the corresponding elements can then be looked up by index.
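The mapping from similarities to the indicator f(x) can be illustrated classically. This is a minimal sketch with made-up similarity values; the helper names `neighborhood_indicator` and `solution_indices` and the threshold 0.5 are illustrative assumptions, not values from the patent.

```python
def neighborhood_indicator(similarities, min_similarity):
    """Return f(x) for each index x: 1 if the similarity meets the preset minimum, else 0."""
    return [1 if s >= min_similarity else 0 for s in similarities]

def solution_indices(indicator):
    """Return the indices x with f(x) = 1, i.e. the points inside the preset neighborhood."""
    return [x for x, fx in enumerate(indicator) if fx == 1]

# Hypothetical similarities for the 8 points numbered 0-7.
sims = [0.10, 0.20, 0.15, 0.30, 0.90, 0.85, 0.95, 0.80]
f = neighborhood_indicator(sims, min_similarity=0.5)
neighbors = solution_indices(f)
```

With these example values, `f` reproduces the pattern described above (0 for indices 0-3, 1 for indices 4-7). The quantum search described next is meant to find the same indices without scanning every entry.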
First, a second superposition state |ψ > is created as shown in equation 13:
Figure BDA0003327391450000112
where N is the number of probability values output. Taking set A as an example, N = 8.
The first operator O (the Oracle operator) is then set; it inverts the phase of the corresponding quantum state when f(x) = 1, as shown in Equation 14:
Figure BDA0003327391450000113
A second operator G (the Grover operator) is defined to amplify the amplitude of the phase-inverted quantum states, as shown in Equation 15:
$G = (2|\psi\rangle\langle\psi| - I)\,O$ (Equation 15)
where O is
Figure BDA0003327391450000121
Without loss of generality, let the quantum state composed of all x with f(x) = 1 be as shown in Equation 16:
Figure BDA0003327391450000122
Then the quantum state composed of all x with f(x) = 0 is as shown in Equation 17:
Figure BDA0003327391450000123
where M is the number of solutions in the set, |α> is the quantum superposition of all non-solutions, and |β> is the quantum superposition of all solutions, i.e., the final quantum state.
Here $N = 2^n$. Therefore |ψ> can be represented as in Equation 18:
Figure BDA0003327391450000124
The Grover algorithm is then applied, using Equation 19:
$O(a|\alpha\rangle + b|\beta\rangle) = a|\alpha\rangle - b|\beta\rangle$ (Equation 19)
For ease of calculation, let
Figure BDA0003327391450000125
It follows that |ψ> after one application of the second operator G is as shown in Equation 20:
Figure BDA0003327391450000126
Further, |ψ> after k applications of the second operator G is as shown in Equation 21:
Figure BDA0003327391450000127
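The equation images are not reproduced in this text. As a reading aid only, the standard textbook form of Grover's algorithm is sketched below; it is consistent with Equations 15 and 19 and with the definitions of N, M, |α> and |β> given above. The angle symbol θ and the exact notation of the patent's own Equations 13, 14, 16-18, 20 and 21 are assumptions.

```latex
\begin{aligned}
|\psi\rangle &= \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |x\rangle
  && \text{(uniform superposition)} \\
O|x\rangle &= (-1)^{f(x)} |x\rangle
  && \text{(phase inversion when } f(x)=1\text{)} \\
|\beta\rangle &= \frac{1}{\sqrt{M}} \sum_{f(x)=1} |x\rangle, \qquad
|\alpha\rangle = \frac{1}{\sqrt{N-M}} \sum_{f(x)=0} |x\rangle \\
|\psi\rangle &= \sqrt{\frac{N-M}{N}}\,|\alpha\rangle + \sqrt{\frac{M}{N}}\,|\beta\rangle \\
\cos\frac{\theta}{2} &= \sqrt{\frac{N-M}{N}}, \qquad
\sin\frac{\theta}{2} = \sqrt{\frac{M}{N}} \\
G|\psi\rangle &= \cos\frac{3\theta}{2}\,|\alpha\rangle + \sin\frac{3\theta}{2}\,|\beta\rangle \\
G^{k}|\psi\rangle &= \cos\!\Big(\frac{(2k+1)\theta}{2}\Big)|\alpha\rangle
  + \sin\!\Big(\frac{(2k+1)\theta}{2}\Big)|\beta\rangle
\end{aligned}
```

In this form, each application of G rotates the state by θ in the plane spanned by |α> and |β>, and the probability of measuring a solution after k applications is $\sin^2\big((2k+1)\theta/2\big)$.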
This can be visualized as in Fig. 7: applying G repeatedly brings |ψ> progressively closer to |β>. Finally, measuring |ψ> yields, with high probability, an index contained in |β>, that is, an x with f(x) = 1. Illustratively, a modular quantum circuit diagram of the Grover algorithm is shown in Fig. 6. As those skilled in the art will appreciate,
Figure BDA0003327391450000128
represents a quantum logic gate module (including H gates) that creates the superposition state, the Oracle workspace corresponds to the first operator O (the Oracle operator), and G corresponds to the second operator (the Grover operator).
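The amplification described above can be reproduced with a small classical state-vector simulation. This is a minimal sketch under stated assumptions: it uses the standard Grover construction (uniform initial state, a phase-flip oracle, and G as in Equation 15), and the function names `grover_success_probability` and `predicted_probability` are hypothetical.

```python
import numpy as np

def grover_success_probability(f_values, k):
    """Apply G = (2|psi><psi| - I) O k times; return P(measuring some x with f(x) = 1)."""
    f = np.asarray(f_values)
    n_states = f.size
    psi = np.full(n_states, 1.0 / np.sqrt(n_states))      # uniform superposition
    oracle = np.diag(np.where(f == 1, -1.0, 1.0))          # phase flip on solutions
    diffusion = 2.0 * np.outer(psi, psi) - np.eye(n_states)
    grover = diffusion @ oracle
    state = psi.copy()
    for _ in range(k):
        state = grover @ state
    return float(np.sum(state[f == 1] ** 2))

def predicted_probability(n_states, n_solutions, k):
    """Textbook closed form sin^2((2k+1) * theta/2) with sin(theta/2) = sqrt(M/N)."""
    half_theta = np.arcsin(np.sqrt(n_solutions / n_states))
    return float(np.sin((2 * k + 1) * half_theta) ** 2)

one_solution = [0, 0, 0, 0, 0, 1, 0, 0]    # hypothetical case: M = 1, N = 8
half_solutions = [0, 0, 0, 0, 1, 1, 1, 1]  # the 4-of-8 pattern from the example above
```

For `one_solution`, two applications of G raise the success probability well above the initial 1/8. For `half_solutions` (M = N/2), the closed form gives 1/2 for every k, so in that particular case repeated application of G does not raise the success probability above that of measuring the initial superposition.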
In this embodiment, a first data point is obtained from a first set that includes a plurality of data points. A preset quantum circuit is used to determine whether the number of other data points in the preset neighborhood of the first data point is greater than a preset threshold. If so, the first data point is marked as a core point, and the other data points in its preset neighborhood are assigned to a first cluster with the first data point as its core. In this way, data with higher similarity are grouped into the same cluster based on quantum computation and a clustering algorithm.
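The overall flow (count neighbors, mark core points, assign neighbors to the core's cluster, merge clusters that share a point) can be sketched classically. This is an illustration only, not the patent's implementation: the function name `cluster_points` is hypothetical, the quantum similarity and quantum search steps are replaced by a classical squared overlap of normalized vectors, and a union-find structure is an assumed way to realize the merging of clusters.

```python
import numpy as np

def cluster_points(points, min_similarity, threshold):
    """Return (labels, core_flags); label -1 means the point was assigned to no cluster."""
    pts = [np.asarray(p, dtype=float) / np.linalg.norm(p) for p in points]
    n = len(pts)
    parent = list(range(n))  # union-find forest used to merge overlapping clusters

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    core = [False] * n
    assigned = [False] * n
    for i in range(n):
        # Classical stand-in for the quantum similarity: squared overlap of unit vectors.
        neighbors = [j for j in range(n)
                     if j != i and float(np.dot(pts[i], pts[j])) ** 2 >= min_similarity]
        if len(neighbors) > threshold:
            core[i] = True
            assigned[i] = True
            for j in neighbors:
                assigned[j] = True
                # Joining j to i's cluster also merges any cluster j already belonged to.
                parent[find(j)] = find(i)
    labels = [find(i) if assigned[i] else -1 for i in range(n)]
    return labels, core

# Hypothetical data: two tight groups and one outlier.
example = [[1, 0], [1, 0.05], [1, -0.05], [0, 1], [0.05, 1], [-0.05, 1], [1, 1]]
labels, core = cluster_points(example, min_similarity=0.9, threshold=1)
```

Here a label is simply the index of a representative point of the cluster, and points that are never placed in any neighborhood of a core point keep the label -1.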
The foregoing describes the present invention from the method perspective; the following describes it from the device perspective. Referring to Fig. 8, the device includes:
an acquisition unit 801 for acquiring a first data point in a first set, the first set comprising a plurality of data points;
a judging unit 802, configured to judge whether the number of other data points in the preset neighborhood of the first data point is greater than a preset threshold;
and a marking unit 803, configured to mark the first data point as a core point if the number of other data points in the preset neighborhood of the first data point is greater than a preset threshold, and mark other data points in the preset neighborhood of the first data point into a first cluster using the first data point as a core.
It can be seen that the acquisition unit 801 acquires a first data point in a first set that includes a plurality of data points; the judging unit 802 judges whether the number of other data points in the preset neighborhood of the first data point is greater than a preset threshold; and, if it is, the marking unit 803 marks the first data point as a core point and assigns the other data points in its preset neighborhood to a first cluster with the first data point as its core. Data with higher similarity are thus grouped into the same cluster based on quantum computation and a clustering algorithm.
The following describes the method in detail, taking its operation on a computer terminal as an example. Fig. 9 is a hardware structure block diagram of a computer terminal for a data clustering method based on quantum computation according to an embodiment of the present invention. As shown in Fig. 9, the computer terminal may include one or more (only one is shown in Fig. 9) processors 901 (the processor 901 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 902 for storing data, and optionally a transmission device 903 for communication functions and an input-output device 904. It will be appreciated by those skilled in the art that the configuration shown in Fig. 9 is merely illustrative and does not limit the configuration of the computer terminal described above. For example, the computer terminal may include more or fewer components than shown in Fig. 9, or have a different configuration from that shown in Fig. 9.
The memory 902 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the data clustering method based on quantum computation in the embodiments of the present application, and the processor 901 executes the software programs and modules stored in the memory 902, thereby performing various functional applications and data processing, that is, implementing the above-mentioned method. The memory 902 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 902 may further comprise memory remotely located relative to the processor 901, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 903 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 903 comprises a network adapter (Network Interface Controller, NIC), which can be connected to other network devices via a base station so as to communicate with the internet. In one example, the transmission device 903 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
The embodiments of the present application also provide a computer readable storage medium, wherein the computer readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute part or all of the steps of any one of the above method embodiments, the computer including an electronic device.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the methods described in the method embodiments above. The computer program product may be a software installation package, said computer comprising an electronic device.
The embodiment of the application also provides a quantum computer operating system which realizes the data clustering processing based on quantum computing according to part or all of the steps of any one of the methods described in the embodiment of the method.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, such as the above-described division of units, merely a division of logic functions, and there may be additional manners of dividing in actual implementation, such as multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, or may be in electrical or other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer readable memory, which may include: a flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the above description of the embodiments is intended only to help in understanding the method of the present application and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A data clustering method based on quantum computation, the method comprising:
obtaining a first data point in a first set, the first set comprising a plurality of data points;
judging whether the number of other data points in a preset neighborhood of the first data point is larger than a preset threshold value or not according to a preset quantum circuit;
if so, marking the first data point as a core point, and assigning the other data points in the preset neighborhood of the first data point to a first cluster with the first data point as its core.
2. The method of claim 1, wherein if the number of other data points in the predetermined neighborhood of the first data point is not greater than a predetermined threshold, the method further comprises:
the first data point is marked as a non-core point, which is not the core of any cluster.
3. The method of claim 1, wherein after assigning the other data points in the preset neighborhood of the first data point to the first cluster with the first data point as its core, the method further comprises:
acquiring a second data point in the first set, the second data point being located at a different location than the first data point;
judging whether the number of other data points in the preset neighborhood of the second data point is larger than a preset threshold value or not;
if so, setting the second data point as a core point, and assigning the other data points in the preset neighborhood of the second data point to a second cluster with the second data point as its core;
judging whether any data point in the second cluster is already divided into a first cluster taking the first data point as a core;
if yes, the first data point is taken as a core, and the first cluster and the second cluster are combined.
4. The method of claim 1, wherein determining whether the number of other data points in the predetermined neighborhood of the first data point is greater than a predetermined threshold comprises:
calculating the similarity between the first data point and other data points in the first set according to a preset quantum circuit;
judging whether the number of first similarities greater than or equal to the preset similarity is greater than a preset threshold;
if yes, the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value;
if not, the number of other data points in the preset neighborhood of the first data point is not greater than a preset threshold value.
5. The method of claim 4, wherein calculating, one by one, the similarity between the first data point and other data points in the first set according to the preset quantum circuit comprises:
constructing the preset quantum circuit according to preset quantum logic gates, wherein the preset quantum logic gates comprise an RX gate, a RY gate, an H gate and a controlled SWAP gate;
preparing data points in the first set into quantum states respectively;
preparing the quantum state of the first data point and the quantum states of other data points in the first set onto the quantum circuit, and operating the quantum circuit;
and measuring a target quantum bit of the quantum circuit, and obtaining the similarity between the first data point and other data points in the first set according to the measurement result of the target quantum bit.
6. The method of claim 4, wherein determining whether the number of first similarities greater than or equal to a preset similarity is greater than a preset threshold comprises:
mapping a first similarity greater than or equal to the preset similarity to a first target value;
searching, according to a preset quantum search algorithm, for the number of first similarities corresponding to the first target value;
if the number of the first similarities corresponding to the first target value is larger than a preset threshold value, the number of the first similarities larger than or equal to the preset similarity is larger than the preset threshold value;
if the number of the first similarities corresponding to the first target value is not greater than the preset threshold, the number of the first similarities greater than or equal to the preset similarity is not greater than the preset threshold.
7. A quantum computation-based data clustering device, the device comprising:
an acquisition unit configured to acquire a first data point in a first set, the first set including a plurality of data points;
the judging unit is used for judging whether the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value or not;
and the marking unit is used for marking the first data point as a core point and marking other data points in the preset neighborhood of the first data point into a first cluster taking the first data point as a core if the number of other data points in the preset neighborhood of the first data point is larger than a preset threshold value.
8. An electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-6.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any of claims 1-6.
10. A quantum computer operating system implementing quantum computing based data clustering according to the method of any one of claims 1-6.
CN202111268632.2A 2021-10-29 2021-10-29 Quantum computation-based data clustering method, device, equipment and storage medium Pending CN116089674A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111268632.2A CN116089674A (en) 2021-10-29 2021-10-29 Quantum computation-based data clustering method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111268632.2A CN116089674A (en) 2021-10-29 2021-10-29 Quantum computation-based data clustering method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116089674A true CN116089674A (en) 2023-05-09

Family

ID=86187331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111268632.2A Pending CN116089674A (en) 2021-10-29 2021-10-29 Quantum computation-based data clustering method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116089674A (en)

Similar Documents

Publication Publication Date Title
CN113222155B (en) Quantum circuit construction method and device, electronic device and storage medium
WO2022267854A1 (en) Method, system and apparatus for processing quantum computing task, and operating system
CN115618663B (en) Quantum solving method and device for coupling grid equation and physical equation
CN113222153A (en) Quantum state simulation method and device, storage medium and electronic device
CN114511094A (en) Quantum algorithm optimization method and device, storage medium and electronic device
CN113222157B (en) Quantum simulation method, quantum simulation device, electronic device and storage medium
CN114764619A (en) Convolution operation method and device based on quantum circuit
WO2023221680A1 (en) Quantum state preparation circuit generation and quantum state preparation methods and apparatuses, and quantum chip
CN115879562B (en) Quantum program initial mapping determination method and device and quantum computer
CN116415958B (en) Abnormal data detection method and device based on quantum technology and storage medium
CN114219048A (en) Spectral clustering method and device based on quantum computation, electronic equipment and storage medium
CN116089674A (en) Quantum computation-based data clustering method, device, equipment and storage medium
CN115907021B (en) Quantum calculation-based data clustering method and device and quantum computer
CN116089675A (en) Quantum clustering method and device, electronic equipment and storage medium
CN115983392A (en) Method, device, medium and electronic device for determining quantum program mapping relation
CN114881238A (en) Method and apparatus for constructing quantum discriminator, medium, and electronic apparatus
CN115409185A (en) Construction method and device of quantum line corresponding to linear function
CN115907016B (en) Quantum-based method for calculating search target range value and related device
CN115730668B (en) Quantum circuit cutting method and device and quantum computer operating system
CN115423108B (en) Quantum circuit cutting processing method and device and quantum computer operating system
CN116049506B (en) Quantum calculation-based numerical value searching method, device, equipment and storage medium
CN115730669B (en) Quantum circuit processing method and device and quantum computer operating system
CN115936132B (en) Quantum circuit simulation method and related device
CN115511094B (en) Quantum circuit execution result determining method and device and quantum computer operating system
CN116432760A (en) Quantum data classification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 230088 6th floor, E2 building, phase II, innovation industrial park, 2800 innovation Avenue, Hefei high tech Zone, Hefei City, Anhui Province

Applicant after: Benyuan Quantum Computing Technology (Hefei) Co.,Ltd.

Address before: 230088 6th floor, E2 building, phase II, innovation industrial park, 2800 innovation Avenue, Hefei high tech Zone, Hefei City, Anhui Province

Applicant before: ORIGIN QUANTUM COMPUTING COMPANY, LIMITED, HEFEI
