CN113688906A - Customer segmentation method and system based on quantum K-means algorithm - Google Patents

Customer segmentation method and system based on quantum K-means algorithm

Info

Publication number
CN113688906A
Authority
CN
China
Prior art keywords
quantum
data
quantum state
sample
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110982944.3A
Other languages
Chinese (zh)
Inventor
李晓瑜
黄思维
张仕斌
昌燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yuanjiang Technology Co ltd
Original Assignee
Sichuan Yuanjiang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yuanjiang Technology Co ltd
Priority to CN202110982944.3A
Publication of CN113688906A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a customer segmentation method and system based on a quantum K-means algorithm. The method comprises the following steps: acquiring a customer behavior data set D; converting each sample x_m in the customer behavior data set D into a quantum state |x_m> according to its feature values, and converting the k selected cluster centers c_i into quantum states |c> according to their feature values; performing quantum computation on the customer behavior data and the cluster centers and outputting the similarity between each data sample and each cluster center, the similarity being stored in a quantum state |α_m>; and searching |α_m> for the minimum value between the data sample |x_m> and the cluster centers |c_i>, so as to find the cluster center c_j nearest to sample x_m. The invention standardizes the input data without destroying the relationships within the data, can help enterprises analyze their customers in depth, and offers the computational speed-up, accuracy, and energy savings brought by quantum computing.

Description

Customer segmentation method and system based on quantum K-means algorithm
Technical Field
The invention relates to the field of quantum finance, and in particular to a customer segmentation method and system based on a quantum K-means algorithm.
Background
The Pareto principle, also called the 80/20 rule, plays an important role in economics. It holds that, in any situation, only a small number of factors mainly determine the outcome. Many studies have found that only 20% of customers contribute 80% of a business's profits. In every sector of the financial market, the cost of acquiring new customers is far higher than the cost of retaining existing ones, and marketization has made the products and services of different enterprises increasingly similar, which limits their room for growth. As competition changes, maintaining customer relationships, stratifying customers by their characteristics, optimizing the allocation of enterprise resources across customer types, and maximizing enterprise revenue are fundamental requirements for an enterprise pursuing long-term, stable development.
Traditional survey-based statistical methods consume considerable manpower and material resources, and their results contain errors caused by various external factors. By comparison, customer behavior data retrieved from an existing database costs less to obtain and is more reliable. Machine learning algorithms work on relatively large volumes of data and thereby avoid the problems of the survey-based methods, but the large data volume brings its own problems: long computation times and high consumption of computing resources, which are common shortcomings of existing algorithms.
Distinguishing customers of different value is crucial to an enterprise's core development. Common segmentation methods fall into pre-segmentation and post-segmentation; post-segmentation is generally done by cluster analysis, after which the value of each resulting class is analyzed. However, existing customer segmentation methods are fragmented, case-by-case approaches that cannot tie together the general principles of the disciplines involved. Because customer data has a complex structure, it is usually preprocessed with some method, such as principal component analysis, before a cluster analysis algorithm is applied. Preprocessing does bring some computational benefit, but the processed data loses the implicit relationships present in the original data, and the integrity of the information is reduced. The classical k-means algorithm is conceptually simple and easy to implement, but by its nature it must iterate repeatedly to produce a clustering. With complex, large-scale workloads, its computational cost grows substantially as the data volume grows, so k-means is generally poorly suited to clustering problems with large data volumes. Quantum computing is based on the fundamental properties of quantum systems and offers powerful parallelism and a high capacity for holding data; when processing large data, its performance can far exceed that of a classical computer. The quantum k-means algorithm is a k-means algorithm built on quantum computing theory, and it can effectively improve the computational efficiency of k-means and reduce its space complexity.
Therefore, to effectively address the slow, inefficient, and energy-intensive processing of customer information that enterprises face in a big data environment, a customer segmentation method and system based on a quantum K-means algorithm are provided. They make it possible to analyze and study customer segmentation in a holistic, systematic way and to portray customer profiles more completely and rigorously, which is a problem to be solved in this field.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a customer segmentation method and a customer segmentation system based on a quantum K-means algorithm.
The purpose of the invention is realized by the following technical scheme:
The invention provides a customer segmentation method based on a quantum K-means algorithm, which comprises the following steps:
determining a segmentation angle, namely the number of features d, and acquiring a customer behavior data set D;
converting each sample x_m in the customer behavior data set D into a quantum state |x_m> according to its feature values, and converting the k selected cluster centers c_i into quantum states |c> according to their feature values;
performing quantum computation on the customer behavior data and the cluster centers, and outputting the similarity between each data sample and each cluster center, i.e., computing the similarity between the quantum states |x_m> and |c>, which is stored in a quantum state |α_m>;
searching the quantum state |α_m> for the minimum value between the data sample |x_m> and the cluster centers |c_i>, so as to find the cluster center c_j nearest to sample x_m.
Further, the obtaining of the customer behavior data set D includes:
data extraction: extracting required data from a database;
data cleaning: checking all variables for missing, unknown, and invalid values; then, according to the distribution of each variable and the actual requirements, applying corresponding rules to replace the missing, unknown, and invalid values with valid ones;
data conversion: converting data of different types into a form the quantum k-means algorithm can use, namely quantum states.
Further, each sample x_m in the customer behavior data set D is converted into a quantum state |x_m> according to its feature values, the conversion formula being:
Figure BDA0003229800690000021
where x_mj denotes the j-th feature of the m-th sample x_m;
the k selected cluster centers c_i are converted into quantum states |c> according to their feature values, the conversion formula being:
Figure BDA0003229800690000022
where c_ij denotes the j-th feature of the i-th cluster center c_i.
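The two conversion formulas are given as figures that are not reproduced in this text. A common encoding that matches the description (one amplitude per feature value) is amplitude encoding, and the following is a minimal classical sketch under that assumption. The function name `amplitude_encode` and the zero-padding to a power-of-two length are illustrative choices, not taken from the patent.

```python
import numpy as np

def amplitude_encode(features):
    """Map a real feature vector to unit-norm amplitudes (assumed amplitude encoding).

    Returns the padded amplitude vector and the original Euclidean norm,
    which must be kept separately because normalization discards it.
    """
    v = np.asarray(features, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        raise ValueError("cannot encode the all-zero vector")
    # Number of qubits needed to index len(v) features (at least one qubit).
    n_qubits = max(1, (len(v) - 1).bit_length())
    padded = np.zeros(1 << n_qubits)
    padded[: len(v)] = v
    return padded / norm, norm

amps, norm = amplitude_encode([3.0, 4.0, 0.0])
```

Under this assumption a sample with d features occupies about log2(d) qubits, which is one reading of the "higher data accommodation capability" mentioned in the Background.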
Further, performing quantum computation on the customer behavior data and the cluster centers and outputting the similarity between each data sample and each cluster center, i.e., computing the similarity between the quantum states |x_m> and |c>, which is stored in the quantum state |α_m>, comprises:
using a controlled-swap gate to compute the similarity result |ψ>, with the formula:
Figure BDA0003229800690000031
where n denotes the number of samples x_m,
Figure BDA0003229800690000032
s(x_m, c_i) denotes the similarity between x_m and c_i;
taking the quantum state |ψ> as the input of a phase estimation algorithm, whose output is ||c_i - x_m|>; this is the similarity between the data sample |x_m> and the cluster center |c>, and it is stored in the quantum state |α_m>, with the formula:
Figure BDA0003229800690000033
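The patent's own formulas for |ψ> are in figures that are not reproduced here. As a hedged illustration of what a controlled-swap gate can measure, the sketch below simulates the textbook swap test with NumPy: an ancilla qubit is put through a Hadamard gate, controls a swap of the two registers, and goes through a second Hadamard gate, after which the probability of reading the ancilla as 0 is 1/2 + |<a|b>|^2 / 2. The function names and the restriction to real-valued vectors are assumptions made for this example, and the step from an overlap to the distance |c_i - x_m| is not shown.

```python
import numpy as np

def swap_test_p0(a, b):
    """Probability that the ancilla reads 0 after a swap test on states a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    ab = np.kron(a, b)  # registers in their original order
    ba = np.kron(b, a)  # registers after the swap
    # After H, controlled-swap, H, the ancilla-0 branch holds (|a,b> + |b,a>) / 2.
    branch0 = 0.5 * (ab + ba)
    return float(np.dot(branch0, branch0))

def squared_overlap(a, b):
    """Recover |<a|b>|^2 from the ancilla statistics."""
    return 2.0 * swap_test_p0(a, b) - 1.0
```

Identical states give a probability of 1 and orthogonal states give 1/2, so a larger probability corresponds to a more similar sample and cluster center.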
Further, searching the quantum state |α_m> for the minimum value between the data sample |x_m> and the cluster centers |c_i>, so as to find the cluster center c_j nearest to sample x_m, comprises:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure BDA0003229800690000034
Here, the minimum value in |α_m> is found by repeated iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs, with |β> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center;
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
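The replace-if-closer loop above can be pictured with a classical stand-in. In the sketch below, a plain list scan plays the role of the Grover search (it returns some index whose distance is below the current threshold), so the code shows only the control flow, not any quantum speed-up. The function name `threshold_minimum_search` and the use of a precomputed distance list are assumptions for illustration.

```python
import random

def threshold_minimum_search(distances, rng=None):
    """Return the index of the smallest distance using a threshold-update loop."""
    rng = rng or random.Random()
    # Randomly chosen initial cluster center, as in the first step above.
    best = rng.randrange(len(distances))
    while True:
        # Classical stand-in for the Grover step: all candidates strictly
        # closer than the current threshold.
        closer = [j for j, d in enumerate(distances) if d < distances[best]]
        if not closer:
            return best  # nothing is closer, so best is the nearest center
        # A temporary center c'_j replaces c_j because it is closer.
        best = rng.choice(closer)

nearest = threshold_minimum_search([0.9, 0.4, 0.1, 0.7], random.Random(0))
```

Each pass strictly lowers the threshold, so the loop always terminates at a minimum regardless of the random starting center.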
In a second aspect of the present invention, there is provided a customer segmentation system based on a quantum K-means algorithm, comprising:
a segmentation angle module, used for determining a segmentation angle, namely the number of features d, and acquiring a customer behavior data set D;
a quantum state conversion module, used for converting each sample x_m in the customer behavior data set D into a quantum state |x_m> according to its feature values, and converting the k selected cluster centers c_i into quantum states |c> according to their feature values;
a similarity calculation module, used for performing quantum computation on the customer behavior data and the cluster centers and outputting the similarity between each data sample and each cluster center, i.e., computing the similarity between the quantum states |x_m> and |c>, which is stored in a quantum state |α_m>;
a cluster center search module, used for searching the quantum state |α_m> for the minimum value between the data sample |x_m> and the cluster centers |c_i>, so as to find the cluster center c_j nearest to sample x_m.
Further, the obtaining of the customer behavior data set D includes:
a data extraction submodule: extracting required data from a database;
a data cleaning submodule: checking all variables for missing, unknown, and invalid values; then, according to the distribution of each variable and the actual requirements, applying corresponding rules to replace the missing, unknown, and invalid values with valid ones;
a data conversion submodule: converting data of different types into a form the quantum k-means algorithm can use, namely quantum states.
Further, each sample x_m in the customer behavior data set D is converted into a quantum state |x_m> according to its feature values, the conversion formula being:
Figure BDA0003229800690000041
where x_mj denotes the j-th feature of the m-th sample x_m;
the k selected cluster centers c_i are converted into quantum states |c> according to their feature values, the conversion formula being:
Figure BDA0003229800690000042
where c_ij denotes the j-th feature of the i-th cluster center c_i.
Further, the similarity calculation module includes:
using a controlled-swap gate to compute the similarity result |ψ>, with the formula:
Figure BDA0003229800690000043
where n denotes the number of samples x_m,
Figure BDA0003229800690000044
s(x_m, c_i) denotes the similarity between x_m and c_i;
taking the quantum state |ψ> as the input of a phase estimation algorithm, whose output is ||c_i - x_m|>; this is the similarity between the data sample |x_m> and the cluster center |c>, and it is stored in the quantum state |α_m>, with the formula:
Figure BDA0003229800690000045
Further, the cluster center search module includes:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure BDA0003229800690000046
Here, the minimum value in |α_m> is found by repeated iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs, with |β> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center;
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
In a third aspect of the invention, a storage medium is provided having computer instructions stored thereon which, when executed, perform the steps of the customer segmentation method based on the quantum K-means algorithm.
In a fourth aspect of the present invention, there is provided a terminal comprising a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the steps of the customer segmentation method based on the quantum K-means algorithm.
The invention has the beneficial effects that:
(1) In an exemplary embodiment of the present invention, traditional customer segmentation must first preprocess the data, which gains computational benefit at the cost of data integrity. In contrast, the customer segmentation method based on the quantum k-means algorithm in this exemplary embodiment standardizes the input data without destroying the relationships within the data, which can help an enterprise analyze its customers in depth; the computational speed-up, accuracy, and energy savings brought by quantum computing are likewise what enterprises hope for.
Exemplary embodiments of the system, the storage medium, and the terminal of the present invention have the same advantages.
(2) In yet another exemplary embodiment of the present invention, customer data is obtained by extracting the required data, according to the enterprise's requirements, from an existing or authorized database within the enterprise to serve as the training data set. The obtained data must be standardized to facilitate processing by the model, but the standardization step does not disrupt the relationships within the data.
(3) In yet another exemplary embodiment of the present invention, specific implementations of the subsequent steps are disclosed.
Drawings
Fig. 1 is a flow chart of the method provided by an exemplary embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that directions or positional relationships indicated by "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", and the like are directions or positional relationships described based on the drawings, and are only for convenience of description and simplification of description, and do not indicate or imply that the device or element referred to must have a specific orientation, be configured and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Existing customer segmentation methods are fragmented, case-by-case approaches that cannot tie together the general principles of the disciplines involved. Because customer data has a complex structure, it is usually preprocessed with some method, such as principal component analysis, before a cluster analysis algorithm is applied. Preprocessing does bring some computational benefit, but the processed data loses the implicit relationships present in the original data, and the integrity of the information is reduced. The classical k-means algorithm is conceptually simple and easy to implement, but by its nature it must iterate repeatedly to produce a clustering. With complex, large-scale workloads, its computational cost grows substantially as the data volume grows, so k-means is generally poorly suited to clustering problems with large data volumes. Quantum computing is based on the fundamental properties of quantum systems and offers powerful parallelism and a high capacity for holding data; when processing large data, its performance can far exceed that of a classical computer. The quantum k-means algorithm is a k-means algorithm built on quantum computing theory, and it can effectively improve the computational efficiency of k-means and reduce its space complexity.
The essence of customer segmentation is that an enterprise, applying a segmentation criterion M to an existing customer information data set D, divides D into a number of subclasses Y_i, where i ≥ 2 denotes the number of subclasses, i.e., the number of customer groups. Specifically, a customer information data set D = (d_1, d_2, ..., d_n) containing n samples is divided into k groups Y = (Y_1, Y_2, ..., Y_k). Each subclass Y_i must be non-empty, and each sample can ultimately belong to only one class. The following exemplary embodiments perform customer segmentation based on customer behavior in the context of a power grid; the method may also be applied in other fields with corresponding operations, which are not limited herein.
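The constraints stated above (at least two groups, no empty group, each sample in exactly one group) can be checked mechanically. The following is a small sketch; the function name `is_valid_partition` and the representation of groups as lists of sample indices are assumptions made for illustration.

```python
def is_valid_partition(n_samples, groups):
    """Check that groups split samples 0..n_samples-1 into >= 2 non-empty, disjoint classes."""
    if len(groups) < 2:
        return False
    seen = set()
    for group in groups:
        if len(group) == 0:
            return False  # every subclass Y_i must be non-empty
        for idx in group:
            if idx in seen or not 0 <= idx < n_samples:
                return False  # a sample may belong to only one class
            seen.add(idx)
    # Every sample must end up in some class.
    return len(seen) == n_samples
```

For example, `is_valid_partition(5, [[0, 1], [2], [3, 4]])` holds, while a grouping with an empty class or a repeated index does not.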
Referring to fig. 1, fig. 1 illustrates a customer segmentation method based on a quantum K-means algorithm according to an exemplary embodiment of the present invention, including:
determining a segmentation angle, namely the number of features d, and acquiring a customer behavior data set D;
converting each sample x_m in the customer behavior data set D into a quantum state |x_m> according to its feature values, and converting the k selected cluster centers c_i into quantum states |c> according to their feature values;
performing quantum computation on the customer behavior data and the cluster centers, and outputting the similarity between each data sample and each cluster center, i.e., computing the similarity between the quantum states |x_m> and |c>, which is stored in a quantum state |α_m>;
searching the quantum state |α_m> for the minimum value between the data sample |x_m> and the cluster centers |c_i>, so as to find the cluster center c_j nearest to sample x_m.
Determining the segmentation angle lets an enterprise divide customers into several groups according to certain criteria and its own needs. Common segmentation angles include geographic, demographic, psychographic, and behavioral segmentation. The segmentation angle selected in this exemplary embodiment is behavioral segmentation, i.e., customer behavior, so the data set used is a customer behavior data set. The number of data features in this data set is denoted d.
The number of cluster centers equals the number of classes, and the samples near a cluster center form one class.
The invention provides a customer segmentation method based on a quantum machine learning algorithm. Traditional customer segmentation must first preprocess the data, which gains computational benefit at the cost of data integrity. In contrast, the customer segmentation method based on the quantum machine learning algorithm in this exemplary embodiment standardizes the input data without destroying the relationships within the data, which can help an enterprise analyze its customers in depth; the computational speed-up, accuracy, and energy savings brought by quantum computing are likewise what enterprises hope for.
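For reference, the classical counterpart of "samples near a cluster center form one class" is the assignment step of ordinary k-means, followed by a center update. The sketch below shows one such iteration with NumPy; it is a classical baseline for comparison, not the quantum procedure of this document, and the function names are illustrative.

```python
import numpy as np

def assign_to_nearest(X, centers):
    """Label each sample with the index of its nearest cluster center."""
    # Pairwise Euclidean distances, shape (n_samples, k).
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def update_centers(X, labels, centers):
    """Move each center to the mean of its members; keep empty clusters unchanged."""
    new_centers = centers.astype(float).copy()
    for i in range(len(centers)):
        members = X[labels == i]
        if len(members) > 0:
            new_centers[i] = members.mean(axis=0)
    return new_centers

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = assign_to_nearest(X, centers)
centers = update_centers(X, labels, centers)
```

The assignment step costs on the order of n·k·d distance operations per iteration, which is the growth with data volume that the Background describes.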
The following exemplary embodiments will set forth the individual steps in detail.
Preferably, in an exemplary embodiment, the obtaining the customer behavior data set D includes:
data extraction: extracting required data from a database;
data cleaning: checking all variables for missing, unknown, and invalid values; then, according to the distribution of each variable and the actual requirements, applying corresponding rules to replace the missing, unknown, and invalid values with valid ones; this step prevents abnormal data from adversely affecting the clustering process.
Data conversion: converting data of different types (different formats, types, and distributions) into a form the quantum k-means algorithm can use, namely quantum states.
Specifically, in this exemplary embodiment, customer data is obtained by extracting the required data, according to the enterprise's requirements, from an existing or authorized database within the enterprise to serve as the training data set. The obtained data must be standardized to facilitate processing by the model.
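The text does not say which cleaning rules or which standardization are used. As one possible reading, the sketch below fills missing entries (encoded as NaN) with the column mean and then applies a z-score to each feature. Both choices, and the function name `clean_and_standardize`, are assumptions for illustration only.

```python
import numpy as np

def clean_and_standardize(X):
    """Fill NaN entries with column means, then z-score each column."""
    X = np.array(X, dtype=float)  # work on a copy
    col_mean = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_mean[cols]  # assumed rule: mean imputation
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # avoid dividing by zero for constant columns
    return (X - X.mean(axis=0)) / std

raw = [[1.0, 10.0], [3.0, np.nan], [5.0, 30.0]]
clean = clean_and_standardize(raw)
```

A z-score is an affine rescaling of each feature, so the ordering of values within a feature is preserved, which is one sense in which standardization leaves the relationships in the data intact.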
After the customer data is obtained, the segmentation angle must be determined; different segmentation angles lead to different data analysis results. This exemplary embodiment starts from an analysis of power grid customer behavior and studies the characteristics of the resulting customer groups. The power grid customer behavior data includes daily load, summer/winter electricity consumption, load factor, payment type, number of calls to the consultation hotline, number of online logins, and preferences such as business handling.
A cluster analysis method is then chosen, cluster centers are selected, and the detailed customer classification is obtained. The initial cluster centers are obtained by screening the power grid customer behavior data with factor analysis. They can also be chosen by experience, but centers chosen by experience tend to be subjective, and the resulting clustering is then not scientific or objective. This exemplary embodiment uses factor analysis to generate the cluster centers and takes the screened variables as the cluster centers so that the algorithm converges as quickly as possible. The segmentation method used in this exemplary embodiment is the quantum k-means algorithm, with the following specific steps:
(1) Preparing the quantum states. The power grid customer behavior data set D is standardized. Each of the n samples x_m in the data set D has d features, and each feature of each sample is denoted x_mj. The cluster centers are denoted c; there are k cluster centers, c = (c_1, c_2, ..., c_k), and each cluster center is denoted c_i. The power grid customer behavior data, used as the input of the quantum k-means algorithm, is represented by the quantum state |x_m>, and the selected cluster centers are represented by the quantum state |c>. For example, x_0j denotes the j-th feature value of the 0-th data point, and c_ij denotes the j-th feature value of the i-th cluster center. This part mainly accomplishes the conversion of classical data into quantum states.
Preferably, in an exemplary embodiment, each sample x_m in the customer behavior data set D is converted into a quantum state |x_m> according to its feature values, the conversion formula being:
Figure BDA0003229800690000081
where x_mj denotes the j-th feature of the m-th sample x_m;
the k selected cluster centers c_i are converted into quantum states |c> according to their feature values, the conversion formula being:
Figure BDA0003229800690000082
where c_ij denotes the j-th feature of the i-th cluster center c_i.
(2) Calculating the similarity between the power grid customer behavior data samples and the cluster centers, i.e., computing the similarity between the quantum states |x_m> and |c>. A controlled-swap gate (Control-Swap Gate) is used to compute the similarity, and a phase estimation algorithm is used to store the similarity between each data sample and each cluster center in qubits. The result |ψ> computed by the controlled-swap gate is passed through the phase estimation algorithm to obtain ||c_i - x_m|>, which is the similarity between the data sample |x_m> and the cluster center |c> and is stored in the quantum state |α_m>; the smaller the value, the higher the similarity. This part takes the power grid customer behavior data and the cluster centers as input and, through quantum computation, outputs the similarity between each data sample and each cluster center.
Preferably, in an exemplary embodiment, outputting the similarity between each data sample and each cluster center by quantum computation on the customer behavior data and the cluster centers, that is, computing the similarity between the quantum states |x_m> and |c> and storing it in the quantum state |α_m>, comprises:
computing the similarity result |ψ> with the controlled-swap gate, according to the formula:
Figure BDA0003229800690000091
where n denotes the number of samples x_m,
Figure BDA0003229800690000092
s(x_m, c_i) denotes the similarity between x_m and c_i;
the quantum state |ψ> is the input of the phase estimation algorithm, whose output is | |c_i - x_m| >, the similarity between the data sample |x_m> and the cluster center |c>, stored in the quantum state |α_m>. The formula is:
Figure BDA0003229800690000093
The quantum phase estimation method computes the phase of the target quantum state and is implemented mainly by the quantum Fourier transform:
Figure BDA0003229800690000094
where the quantum state |ψ> is the input x_j|j> and | |c_i - x_m| > is the output y_k|k>. In addition, | |c_i - x_m| > is s(x_m, c_i).
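For reference, the standard textbook definition of the quantum Fourier transform on N basis states is given below. It matches the input and output roles described above, but the patent's own formula is a figure that is not reproduced in this text, so this is an assumed form; the symbol N (the number of basis states) is introduced here for illustration.

```latex
\sum_{j=0}^{N-1} x_j \lvert j \rangle
\;\longmapsto\;
\sum_{k=0}^{N-1} y_k \lvert k \rangle,
\qquad
y_k = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} x_j \, e^{2\pi i\, jk/N}.
```

Phase estimation applies the inverse of this transform to a register of phase-kicked qubits, which is why the estimated quantity can be read out as a basis state.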
(3) Searching for the maximum similarity, that is, the minimum distance, between each power grid customer behavior data sample and the cluster centers. |α> holds n·k values | |c_i - x_m| >, and each |α_m> holds k values | |c_i - x_m| >. A quantum minimum search algorithm finds the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>. In this way the quantum minimum search algorithm finds the assignment that gives the best clustering of the power grid customer behavior data.
More preferably, in an exemplary embodiment, finding the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>, so as to find the cluster center c_j nearest to the sample x_m, comprises:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure BDA0003229800690000101
Here, the minimum of |α_m> is found by continued iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs and |b> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center (|b> may be a conditional control input: when the overall input satisfies a given condition, the control input b yields the desired output);
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
Finally, the computed results are combined: the quantum minimum search algorithm finds the cluster center c_j nearest to x_m, and x_m is assigned to c_j. The quantum k-means algorithm thus helps the enterprise analyze power grid customer behavior data. Once the segmentation of power grid customer behavior is complete, personalized electricity services can be offered to different customers according to actual market conditions, a differentiated value-added service scheme can be realized, and the enterprise is helped to generate revenue steadily.
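The iteration above can be emulated classically to show its control flow. In the sketch below, each pass of the loop stands in for one Grover search: it returns some index whose distance is below the current threshold, chosen at random among the candidates, and the loop stops when no such index exists. This is only an illustration under that assumption; it has none of the quantum speed-up, and the name `nearest_center` is hypothetical.

```python
import random

def nearest_center(distances, rng=random):
    """Classically emulate the threshold-update loop of a quantum minimum search.

    distances[i] plays the role of |c_i - x_m| for a fixed sample x_m.
    """
    best = rng.randrange(len(distances))      # random initial cluster center c_i
    while True:
        # Stand-in for Grover search: indices strictly closer than the threshold.
        closer = [i for i, d in enumerate(distances) if d < distances[best]]
        if not closer:                        # no closer center remains
            return best
        best = rng.choice(closer)             # c'_j replaces c_j

# Distances from one sample to k = 4 cluster centers.
assigned = nearest_center([0.9, 0.2, 0.5, 0.7])
```

Because the threshold strictly decreases on every update, the loop always terminates at an index with the smallest distance, regardless of the random starting point.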
Based on the same inventive concept as the above exemplary embodiment, still another exemplary embodiment of the present invention provides a customer segmentation system based on a quantum K-means algorithm, including:
an angle subdivision module: for determining the segmentation angle, that is, the number of features d, and acquiring the customer behavior data set D;
a quantum state conversion module: for converting each sample x_m into the quantum state |x_m> according to its feature values in the customer behavior data set D, and converting the cluster centers into the quantum state |c> according to the feature values of the k selected cluster centers c_i;
a similarity calculation module: for outputting the similarity between each data sample and each cluster center by quantum computation on the customer behavior data and the cluster centers, that is, computing the similarity between the quantum states |x_m> and |c>, the similarity being stored in the quantum state |α_m>;
a cluster center searching module: for finding the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>, so as to find the cluster center c_j nearest to the sample x_m.
Correspondingly, in an exemplary embodiment, acquiring the customer behavior data set D involves:
a data extraction submodule: extracting the required data from a database;
a data cleaning submodule: checking every variable for missing, unknown and invalid values; then, according to the distribution characteristics of the variables and the actual requirements, applying corresponding rules to replace the missing, unknown and invalid values with valid ones;
a data conversion submodule: converting the different types of data into quantum states that the quantum k-means algorithm can use.
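The text does not specify which cleaning rules are used. As a minimal sketch under assumed rules (missing or invalid entries are marked `None` and replaced by the column mean, after which each column is min-max scaled to [0, 1]), the cleaning step might look like this. The function name `clean_and_scale` and both rules are illustrative assumptions, not the patent's method.

```python
def clean_and_scale(rows):
    """Impute None entries with the column mean, then min-max scale each column.

    Assumes every column contains at least one valid (non-None) value.
    """
    cleaned = [list(r) for r in rows]          # work on a copy
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        mean = sum(present) / len(present)
        for r in cleaned:
            if r[j] is None:                   # assumed rule: mean imputation
                r[j] = mean
        lo = min(r[j] for r in cleaned)
        hi = max(r[j] for r in cleaned)
        span = hi - lo
        for r in cleaned:                      # assumed rule: min-max scaling
            r[j] = 0.0 if span == 0 else (r[j] - lo) / span
    return cleaned

raw = [[1.0, None], [3.0, 10.0], [None, 20.0]]
data = clean_and_scale(raw)
```

Scaling the columns to a common range keeps one large-valued feature from dominating the distances computed later.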
Correspondingly, in an exemplary embodiment, each sample x_m is converted into the quantum state |x_m> according to its feature values in the customer behavior data set D. The conversion formula is:
Figure BDA0003229800690000111
where x_mj denotes the j-th feature of the m-th sample x_m;
the cluster centers are converted into the quantum state |c> according to the feature values of the k selected cluster centers c_i. The conversion formula is:
Figure BDA0003229800690000112
where c_ij denotes the j-th feature of the i-th cluster center c_i.
Correspondingly, in an exemplary embodiment, the similarity calculation module performs the following:
computing the similarity result |ψ> with the controlled-swap gate, according to the formula:
Figure BDA0003229800690000113
where n denotes the number of samples x_m,
Figure BDA0003229800690000114
s(x_m, c_i) denotes the similarity between x_m and c_i;
the quantum state |ψ> is the input of the phase estimation algorithm, whose output is | |c_i - x_m| >, the similarity between the data sample |x_m> and the cluster center |c>, stored in the quantum state |α_m>. The formula is:
Figure BDA0003229800690000115
Correspondingly, in an exemplary embodiment, the cluster center searching module performs the following:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure BDA0003229800690000116
Here, the minimum of |α_m> is found by continued iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs and |b> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center;
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
Based on the same inventive concept as the above exemplary embodiments, an exemplary embodiment of the present invention provides a storage medium having computer instructions stored thereon, wherein the computer instructions, when executed, perform the steps of the customer segmentation method based on the quantum K-means algorithm.
Based on the same inventive concept as the above exemplary embodiments, an exemplary embodiment of the present invention provides a terminal including a memory and a processor, wherein the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the steps of the customer segmentation method based on the quantum K-means algorithm.
Based on this understanding, the technical solution of the present embodiment, or parts of it, may essentially be implemented in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It is to be understood that the above-described embodiments are merely illustrative and do not limit the invention, and that persons skilled in the art may make other modifications and changes on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived from them remain within the scope of protection of the invention.

Claims (10)

1. A customer segmentation method based on the quantum K-means algorithm, characterized by comprising the following steps:
determining the segmentation angle, that is, the number of features d, and acquiring the customer behavior data set D;
converting each sample x_m into the quantum state |x_m> according to its feature values in the customer behavior data set D, and converting the cluster centers into the quantum state |c> according to the feature values of the k selected cluster centers c_i;
outputting the similarity between each data sample and each cluster center by quantum computation on the customer behavior data and the cluster centers, that is, computing the similarity between the quantum states |x_m> and |c>, the similarity being stored in the quantum state |α_m>;
finding the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>, so as to find the cluster center c_j nearest to the sample x_m.
2. The customer segmentation method based on the quantum K-means algorithm of claim 1, wherein acquiring the customer behavior data set D comprises:
data extraction: extracting the required data from a database;
data cleaning: checking every variable for missing, unknown and invalid values; then, according to the distribution characteristics of the variables and the actual requirements, applying corresponding rules to replace the missing, unknown and invalid values with valid ones;
data conversion: converting the different types of data into quantum states that the quantum k-means algorithm can use.
3. The customer segmentation method based on the quantum K-means algorithm of claim 1, wherein each sample x_m is converted into the quantum state |x_m> according to its feature values in the customer behavior data set D by the conversion formula:
Figure FDA0003229800680000011
where x_mj denotes the j-th feature of the m-th sample x_m;
and the cluster centers are converted into the quantum state |c> according to the feature values of the k selected cluster centers c_i by the conversion formula:
Figure FDA0003229800680000012
where c_ij denotes the j-th feature of the i-th cluster center c_i.
4. The customer segmentation method based on the quantum K-means algorithm of claim 3, wherein outputting the similarity between each data sample and each cluster center by quantum computation on the customer behavior data and the cluster centers, that is, computing the similarity between the quantum states |x_m> and |c> and storing it in the quantum state |α_m>, comprises:
computing the similarity result |ψ> with the controlled-swap gate, according to the formula:
Figure FDA0003229800680000021
where n denotes the number of samples x_m,
Figure FDA0003229800680000022
s(x_m, c_i) denotes the similarity between x_m and c_i;
the quantum state |ψ> is the input of the phase estimation algorithm, whose output is | |c_i - x_m| >, the similarity between the data sample |x_m> and the cluster center |c>, stored in the quantum state |α_m>. The formula is:
Figure FDA0003229800680000023
5. The customer segmentation method based on the quantum K-means algorithm of claim 4, wherein finding the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>, so as to find the cluster center c_j nearest to the sample x_m, comprises:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure FDA0003229800680000024
Here, the minimum of |α_m> is found by continued iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs and |b> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center;
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
6. A customer segmentation system based on the quantum K-means algorithm, characterized by comprising:
an angle subdivision module: for determining the segmentation angle, that is, the number of features d, and acquiring the customer behavior data set D;
a quantum state conversion module: for converting each sample x_m into the quantum state |x_m> according to its feature values in the customer behavior data set D, and converting the cluster centers into the quantum state |c> according to the feature values of the k selected cluster centers c_i;
a similarity calculation module: for outputting the similarity between each data sample and each cluster center by quantum computation on the customer behavior data and the cluster centers, that is, computing the similarity between the quantum states |x_m> and |c>, the similarity being stored in the quantum state |α_m>;
a cluster center searching module: for finding the minimum value between the data sample |x_m> and the cluster centers |c_i> in the quantum state |α_m>, so as to find the cluster center c_j nearest to the sample x_m.
7. The customer segmentation system based on the quantum K-means algorithm of claim 6, wherein acquiring the customer behavior data set D involves:
a data extraction submodule: extracting the required data from a database;
a data cleaning submodule: checking every variable for missing, unknown and invalid values; then, according to the distribution characteristics of the variables and the actual requirements, applying corresponding rules to replace the missing, unknown and invalid values with valid ones;
a data conversion submodule: converting the different types of data into quantum states that the quantum k-means algorithm can use.
8. The customer segmentation system based on the quantum K-means algorithm of claim 6, wherein each sample x_m is converted into the quantum state |x_m> according to its feature values in the customer behavior data set D by the conversion formula:
Figure FDA0003229800680000031
where x_mj denotes the j-th feature of the m-th sample x_m;
and the cluster centers are converted into the quantum state |c> according to the feature values of the k selected cluster centers c_i by the conversion formula:
Figure FDA0003229800680000032
where c_ij denotes the j-th feature of the i-th cluster center c_i.
9. The customer segmentation system based on the quantum K-means algorithm of claim 8, wherein the similarity calculation module performs the following:
computing the similarity result |ψ> with the controlled-swap gate, according to the formula:
Figure FDA0003229800680000033
where n denotes the number of samples x_m,
Figure FDA0003229800680000034
s(x_m, c_i) denotes the similarity between x_m and c_i;
the quantum state |ψ> is the input of the phase estimation algorithm, whose output is | |c_i - x_m| >, the similarity between the data sample |x_m> and the cluster center |c>, stored in the quantum state |α_m>. The formula is:
Figure FDA0003229800680000035
10. The customer segmentation system based on the quantum K-means algorithm of claim 9, wherein the cluster center searching module performs the following:
randomly selecting a cluster center c_i as the initial value, and then repeating the following steps
Figure FDA0003229800680000041
Here, the minimum of |α_m> is found by continued iteration:
preparing the quantum state |β> of the initial cluster center value c_i;
taking |α_m> and |β> as inputs and |b> as the control input, and using the Grover algorithm to find c'_j, where c'_j denotes a temporary cluster center;
if |c'_j - x_m| < |c_j - x_m|, replacing c_j with c'_j.
CN202110982944.3A 2021-08-25 2021-08-25 Customer segmentation method and system based on quantum K-means algorithm Pending CN113688906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110982944.3A CN113688906A (en) 2021-08-25 2021-08-25 Customer segmentation method and system based on quantum K-means algorithm

Publications (1)

Publication Number Publication Date
CN113688906A true CN113688906A (en) 2021-11-23

Family

ID=78582677

Country Status (1)

Country Link
CN (1) CN113688906A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination