CN112488228A - Bidirectional clustering method for risk control system data completion - Google Patents


Info

Publication number
CN112488228A
Authority
CN
China
Prior art keywords
clustering
matrix
formula
clusters
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011439471.4A
Other languages
Chinese (zh)
Inventor
郑小禄
诸葛天心
刘羽中
胡亮
仵伟强
尹昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingke Internet Technology Shandong Co ltd
Original Assignee
Jingke Internet Technology Shandong Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-07
Filing date
2020-12-07
Publication date
2021-03-12
Application filed by Jingke Internet Technology Shandong Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of cluster analysis, and in particular to a bidirectional clustering method for risk control system data completion.

Description

Bidirectional clustering method for risk control system data completion
Technical Field
The invention relates to the technical field of cluster analysis, and in particular to a bidirectional clustering method for risk control system data completion.
Background
With the development of information technology and the internet, more and more machine learning algorithms are being applied in the traditional financial field, where much attention is paid to how financial risk control can be performed with big data combined with machine learning. Most traditional risk control models are built on supervised learning tasks with labels. However, as data volumes grow, storage errors, unreliable acquisition equipment, unstable network conditions, malicious fraud by users and other causes leave most of the acquired data incomplete; incomplete data may be redundant, noisy or missing. Missing data is common in risk control systems, and the volume of missing data grows exponentially with the user scale and the service scale. Missing data undermines the accuracy and reliability of risk control decisions: for example, mature risk control models that rely on complete structured data cannot be used, and decisions cannot be made at all when required data are missing. Missing data therefore has many adverse effects on a risk control system, degrading the user experience and increasing decision risk.
Latent factor models based on matrix factorization are widely used for data completion in risk control systems. However, a conventional latent factor model can only complete data along a single dimension, which costs accuracy. Making full use of information from multiple dimensions has therefore become an important research direction for data completion.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a bidirectional clustering method for risk control system data completion that addresses the insufficient speed and efficiency of missing-data completion.
The invention is realized by the following technical scheme: a bidirectional clustering method for risk control system data completion comprises five steps, namely instance clustering, attribute clustering, local matrix construction, local matrix filling and matrix filling, wherein:
Instance clustering assigns the sample points to different first clusters, each with its own centroid. The centroids of the first clusters are updated by a first update formula, and similarity in instance clustering is computed by a first distance formula:
Figure BDA0002821820830000021
where D is the number of attributes of a data object. The subset c assigned to a cluster in instance clustering is given by
Figure BDA0002821820830000022
Attribute clustering clusters the data obtained by instance clustering along the attribute dimension and assigns it to different second clusters, each with its own centroid. The centroids of the second clusters are updated by a second update formula, and similarity in attribute clustering is computed by the first distance formula. The subset d assigned to a cluster in attribute clustering is given by
Figure BDA0002821820830000023
Local matrix construction co-clusters the results of instance clustering and attribute clustering to obtain local matrices.
Local matrix filling fills each local matrix with a latent factor model, based on the correlation between users and items, to obtain a complete local matrix. The latent factor model is $A = UV^{T}$, where A is the local matrix and U and V are the latent factor matrices of the users and the feature items, respectively.
Matrix filling writes the filled local matrices back into the original matrix to obtain a complete matrix.
Further, the first update formula and the second update formula are both
Figure BDA0002821820830000024
where $\mathrm{Center}_k$ is the centroid of the k-th cluster, $C_k$ denotes the k-th cluster, and $|C_k|$ is the number of data objects in the k-th cluster.
Further, after $\mathrm{Center}_k$ is calculated, the sample point closest to it is selected and becomes the updated centroid.
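The formula images are not reproduced in this text. As a hedged reconstruction from the surrounding definitions only, and assuming the first distance formula is the ordinary Euclidean distance (the text states only that D is the number of attributes), the distance and the centroid update might be written as follows. The symbols $x$, $x_j$ and $\mathrm{Center}_{k,j}$ are notation introduced here for illustration.

```latex
% Assumed Euclidean form of the first distance formula:
% x is a data object and Center_k a centroid, both with D attributes.
d(x, \mathrm{Center}_k) = \sqrt{\sum_{j=1}^{D} \left( x_j - \mathrm{Center}_{k,j} \right)^{2}}

% Centroid update as the mean of the data objects in the k-th cluster C_k.
\mathrm{Center}_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```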
The beneficial effects of the invention are as follows. Instance clustering assigns sample points to different clusters so that similarity is high within a cluster and low between clusters. Attribute clustering then clusters the centroids obtained by instance clustering along the attribute dimension, so information from both the instance dimension and the attribute dimension is taken into account. Co-clustering captures latent regularities among rows and columns, and local matrices are constructed from them; the users and items in a local matrix are strongly correlated, and the local matrix is filled by a latent factor model. Compared with existing latent-factor-based filling methods such as matrix factorization and multi-clustering, the method captures local information from two dimensions through instance clustering and attribute clustering and exploits that information more fully, thereby obtaining a better completion result.
Drawings
FIG. 1 is a schematic flow chart of the main steps of the present invention;
FIG. 2 is a flow chart of the overall algorithm process of the present invention;
FIG. 3 is a data diagram of the present invention;
FIG. 4 is a visual comparison diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the following examples, the samples are shown in attached Table 1,
attached table 1
Figure BDA0002821820830000031
Figure BDA0002821820830000041
Attached Table 1 is an excerpt from the public "Lendingclub" data set, which provides customers' personal information and repayment performance and is often used to test how accurately an algorithm predicts, from personal information, whether a customer will perform (repay) or default. Each row of attached Table 1 holds the information of one customer and each column holds one attribute; the last column records whether the customer performed or defaulted and is typically used as the label for prediction.
In instance clustering, the u-th subset $c_u$ is given by
Figure BDA0002821820830000042
where R is the data of the whole table, $R_{u,:}$ is the local instance matrix composed of all rows of the whole table that belong to the u-th subset, and $v_c$ is the centroid vector of the u-th subset, that is, the representative feature vector of this local matrix.
Attached Table 2 is an example of an instance subset matrix.
Attached table 2
Figure BDA0002821820830000043
Attribute clustering clusters the data obtained by instance clustering along the attribute dimension and assigns it to different second clusters, each with its own centroid. The centroids of the second clusters are updated by the second update formula, and similarity in attribute clustering is computed by the first distance formula. In attribute clustering, the m-th subset $d_m$ is given by
Figure BDA0002821820830000051
where
Figure BDA0002821820830000052
is the local matrix data obtained by attribute clustering, as shown in attached Table 3,
Figure BDA0002821820830000053
is the local attribute matrix composed of all columns of the whole table that belong to the m-th subset, and $V_{:,d}$ is the centroid vector of the m-th subset, that is, the representative feature vector of this local matrix. Attached Table 3 is an example of an attribute subset matrix. Note that "whether or not there is a default" is usually treated as a label rather than an attribute, so this column is usually removed before attribute clustering; that is,
Figure BDA0002821820830000054
does not include the "whether or not there is a default" column.
Attached table 3
Figure BDA0002821820830000055
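Removing the label column before attribute clustering can be illustrated with a minimal sketch. The column names and the 0/1 encoding of the label are hypothetical, and the three rows reuse a few values from the sample tables purely as toy data; NumPy is an assumption, not something the patent names.

```python
import numpy as np

# Hypothetical column names; the last one is the label, not an attribute.
columns = ["loan_amount", "credit_score", "debt_to_income", "default_status"]
label_column = "default_status"

# Toy data: 3 customers x 4 columns (assumed encoding: 0 = performing, 1 = default).
R = np.array([
    [3000.0, 674.0, 0.15, 0.0],
    [6500.0, 714.0, 0.21, 0.0],
    [14400.0, 669.0, 0.27, 1.0],
])

# Keep every column except the label.
keep = [i for i, name in enumerate(columns) if name != label_column]
attribute_names = [columns[i] for i in keep]
R_attributes = R[:, keep]

# Attribute clustering treats each column as one sample, so transpose:
# each row of column_vectors is now one attribute across all customers.
column_vectors = R_attributes.T
```

The label column stays available in `R` for later evaluation; only the clustering input excludes it.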
Example 1
As shown in FIGS. 1 to 3, a bidirectional clustering method for risk control system data completion comprises five steps, namely instance clustering, attribute clustering, local matrix construction, local matrix filling and matrix filling, wherein:
Instance clustering assigns the sample points to different first clusters, each with its own centroid. The centroids of the first clusters are updated by a first update formula, and similarity in instance clustering is computed by a first distance formula:
Figure BDA0002821820830000056
where D is the number of attributes of a data object. The subset c assigned to a cluster in instance clustering is given by
Figure BDA0002821820830000057
Attribute clustering clusters the data obtained by instance clustering along the attribute dimension and assigns it to different second clusters, each with its own centroid. The centroids of the second clusters are updated by a second update formula, and similarity in attribute clustering is computed by the first distance formula. In attribute clustering, the m-th subset $d_m$ is given by
Figure BDA0002821820830000061
where $d_m$ is the local matrix data obtained by attribute clustering,
Figure BDA0002821820830000062
is the local attribute matrix composed of all columns of the whole table that belong to the m-th subset, and $V_{:,d}$ is the centroid vector of the m-th subset, that is, the representative feature vector of this local matrix.
Local matrix construction co-clusters the results of instance clustering and attribute clustering to obtain local matrices.
Local matrix filling fills each local matrix with a latent factor model, based on the correlation between users and items, to obtain a complete local matrix. The latent factor model is $A = UV^{T}$, where A is the local matrix and U and V are the latent factor matrices of the users and the feature items, respectively. The first update formula and the second update formula are both
Figure BDA0002821820830000063
where $\mathrm{Center}_k$ is the centroid of the k-th cluster, $C_k$ denotes the k-th cluster, and $|C_k|$ is the number of data objects in the k-th cluster. After $\mathrm{Center}_k$ is calculated, the sample point closest to it is selected and becomes the updated centroid.
Matrix filling writes the filled local matrices back into the original matrix to obtain a complete matrix in which the missing data are filled.
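The local matrix filling step can be sketched as follows. The patent gives only the model $A = UV^{T}$ and does not say how U and V are fitted, so the gradient-descent fitting on observed entries, the function name `fill_with_latent_factors`, and the hyperparameters `lr`, `reg` and `epochs` are all assumptions made for illustration.

```python
import numpy as np

def fill_with_latent_factors(A, rank=3, lr=0.01, reg=0.01, epochs=2000, seed=0):
    """Fill NaN entries of A with U @ V.T, fitted on the observed entries only."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(A)
    A0 = np.where(observed, A, 0.0)          # placeholder zeros where missing
    n_rows, n_cols = A.shape
    U = 0.1 * rng.standard_normal((n_rows, rank))
    V = 0.1 * rng.standard_normal((n_cols, rank))
    for _ in range(epochs):
        err = observed * (U @ V.T - A0)      # error is zero at missing positions
        grad_U = err @ V + reg * U           # gradient of the regularized squared error
        grad_V = err.T @ U + reg * V
        U -= lr * grad_U
        V -= lr * grad_V
    filled = A.copy()
    filled[~observed] = (U @ V.T)[~observed] # observed values are left untouched
    return filled, U, V

# Small local matrix with one missing entry.
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, np.nan]])
filled, U, V = fill_with_latent_factors(A, rank=2)
```

With raw risk control data, where loan amounts and ratios differ by several orders of magnitude, the columns would probably need to be scaled before fitting; the fixed step size used here is only suitable for small, similarly scaled values.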
Taking the processing of the sample in attached Table 1 as an example, the bidirectional clustering method for risk control system data completion comprises the following operation steps:
Step 1, inputting the risk control data with missing values, see attached Table 1;
Step 2, constructing the model and setting the parameters kn, km, I and J. kn is the number of row-vector clustering centroids; its value depends on the number of users, and kn = 3 for the sample data in attached Table 1. km is the number of column-vector clustering centroids; its value depends on the number of attributes, and km = 2 for the sample data in attached Table 1. I and J are the maximum numbers of iterations; their values depend on the matrix row dimension, and I = 5 for the sample data in attached Table 1. The iteration counters are initialized to i = 0 and j = 0;
Step 3, randomly selecting kn user vectors from the risk control data in attached Table 1 as representative user vectors to obtain the first centroids; the kn centroid vectors are shown in attached Table 4, where each row is one centroid vector;
attached table 4
Figure BDA0002821820830000071
Step 4, calculating the distance from each user vector to the kn centroid vectors by the first distance formula, which is
Figure BDA0002821820830000072
where D is the number of attributes of a data object, and assigning each user vector to the class of its nearest centroid vector to obtain kn first clusters; the three first clusters are shown in attached Table 2, attached Table 5 and attached Table 6, respectively;
attached table 5
16 5000 704 0.11 9 6 0.47 8 36 0.12 Performing contract
3 4000 689 0.22 0.58 16 36 0.16 Performing contract
20 10225 689 0.33 30 0.7 52 0.16 Performing contract
18 6000 679 11 10 0.3 38 36 0.08 Performing contract
19 24000 679 0.25 20 29 36 0.12 Performing contract
7 3000 674 0.15 32 10 0.34 25 36 0.16 Performing contract
2 6000 669 0.08 37 1 8 36 0.12 Performing contract
6 3000 669 0.29 4 36 0.16 Performing contract
13 5000 669 0.19 10 10 0.51 41 36 0.09 Performing contract
Attached table 6
14 35000 669 0.17 23 0.87 53 60 0.19 Performing contract
24 14400 669 0.27 37 10 0.74 29 60 0.19 Default
1 19150 0.13 11 1 0.39 41 36 0.19 Performing contract
5 12000 0.06 33 10 0.8 5 60 0.14 Performing contract
11 5700 0.15 16 6 0.34 36 0.07 Performing contract
17 9600 0.15 10 6 0.86 36 0.11 Performing contract
23 14000 0.13 32 9 22 36 0.16 Default
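The assignment in step 4 can be sketched as below. The sample rows contain missing values and the text does not say how the distance treats them, so computing a Euclidean distance over only the attributes observed in both vectors is an assumption, as are the function names `masked_distance` and `assign_to_nearest`.

```python
import numpy as np

def masked_distance(x, center):
    """Euclidean distance over the attributes observed in both vectors (assumption)."""
    both = ~np.isnan(x) & ~np.isnan(center)
    if not both.any():
        return np.inf                        # nothing to compare
    return float(np.sqrt(np.sum((x[both] - center[both]) ** 2)))

def assign_to_nearest(X, centers):
    """For each row of X, return the index of the closest centroid."""
    return np.array([
        int(np.argmin([masked_distance(x, c) for c in centers]))
        for x in X
    ])

# Toy user vectors (NaN marks a missing value) and two centroid vectors.
X = np.array([[0.0, 0.0], [10.0, np.nan], [9.0, 9.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = assign_to_nearest(X, centers)
```

Rows with fewer observed attributes contribute fewer terms to the sum, so a normalization by the number of shared attributes might be preferable in practice.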
Step 5, averaging each first cluster by the centroid update formula
Figure BDA0002821820830000073
where $\mathrm{Center}_k$ is the centroid of the k-th first cluster, $C_k$ denotes the k-th first cluster, and $|C_k|$ is the number of data objects in the k-th first cluster, to obtain the second centroids;
Step 6, judging whether the iteration counter i is equal to I; if not, setting i = i + 1 and returning to step 4; if so, executing step 7;
Step 7, transposing the matrix of second centroids, as shown in attached Table 7, and randomly selecting km vectors from the transposed matrix as the third centroids, as shown in attached Table 8, where each row is one centroid vector.
Attached table 7
User ID                             7                     21                    24
Amount of loan                      3000                  6500                  14400
Credit score value                  674                   714                   669
Debt to income ratio                0.15                  0.21                  0.27
Province of employment              32                    37                    37
Length of employment                10                    10                    10
Revolving credit utilization rate   0.34                  0.75                  0.74
Number of open accounts             25                    12                    29
Number of loan payments             36                    36                    60
Interest rate                       0.16                  0.12                  0.19
Whether or not there is a default   Performing contract   Performing contract   Default
Attached table 8
Credit score value        674   714   669
Number of open accounts   25    12    29
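Steps 3 to 6 (random initial centroids, assignment, centroid update, and a fixed number of iterations) can be sketched as follows. The sketch assumes complete data and Euclidean distance, and it snaps each averaged centroid to the nearest sample point, which is why the centroids in attached Table 7 can be actual users. The function names are hypothetical.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest centroid (Euclidean) for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def update_centroids(X, labels, centers):
    """Average each cluster, then snap the mean to the closest sample point."""
    new_centers = centers.copy()
    for k in range(len(centers)):
        members = X[labels == k]
        if len(members) == 0:
            continue                         # empty cluster: keep the previous centroid
        mean = members.mean(axis=0)
        nearest = np.linalg.norm(X - mean, axis=1).argmin()
        new_centers[k] = X[nearest]
    return new_centers

def cluster_rows(X, kn, max_iter, seed=0):
    """Random initial centroids, then max_iter rounds of assignment and update."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=kn, replace=False)]
    for _ in range(max_iter):
        labels = assign(X, centers)
        centers = update_centroids(X, labels, centers)
    return assign(X, centers), centers

# Two well-separated groups of toy user vectors.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centers = cluster_rows(X, kn=2, max_iter=5)
```

Running a fixed number of rounds mirrors the counter-based stopping rule of step 6; a convergence check on the centroids would be a common alternative.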
Step 8, calculating the distance from each column to the third centroids by the first distance formula and assigning each column to the class of its nearest third centroid to form km second clusters; the two second clusters are shown in attached Table 9 and attached Table 10, respectively;
Attached table 9
Amount of loan            3000   6500   14400
Credit score value        674    714    669
Province of employment    32     37     37
Number of open accounts   25     12     29
Number of loan payments   36     36     60
Attached table 10
Debt to income ratio                0.15   0.21   0.27
Length of employment                10     10     10
Revolving credit utilization rate   0.34   0.75   0.74
Interest rate                       0.16   0.12   0.19
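Column-vector clustering can be sketched by transposing the centroid matrix and clustering its rows. This is a simplified k-means-style loop under assumed Euclidean distance, with a hypothetical function name; it uses four of the attributes above as toy input and is not meant to reproduce the exact grouping of attached Tables 9 and 10.

```python
import numpy as np

def cluster_columns(M, km, max_iter, seed=0):
    """Cluster the columns of M by running k-means-style updates on M.T."""
    cols = M.T.astype(float)                 # one row per attribute
    rng = np.random.default_rng(seed)
    centers = cols[rng.choice(len(cols), size=km, replace=False)]
    for _ in range(max_iter):
        d = np.linalg.norm(cols[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(km):
            if np.any(labels == k):
                centers[k] = cols[labels == k].mean(axis=0)
    d = np.linalg.norm(cols[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

# Rows: three centroid users; columns: credit score, open accounts,
# debt-to-income ratio, revolving utilization (toy subset of the attributes).
M = np.array([
    [674.0, 25.0, 0.15, 0.34],
    [714.0, 12.0, 0.21, 0.75],
    [669.0, 29.0, 0.27, 0.74],
])
col_labels = cluster_columns(M, km=2, max_iter=5)
```

Because unscaled Euclidean distance is dominated by large-valued attributes, the grouping mostly reflects magnitude; whether the patent scales the columns first is not stated.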
Step 9, averaging each second cluster to obtain the fourth centroids;
Step 10, judging whether the iteration counter j is equal to J; if not, setting j = j + 1 and returning to step 8; if so, executing step 11;
Step 11, constructing local matrices from the row-vector clustering result (the first clusters) and the column-vector clustering result (the second clusters); the local matrix constructed from the row-vector cluster of attached Table 4 and the column-vector cluster of attached Table 9 is shown in attached Table 11, and the local matrix constructed from the row-vector cluster of attached Table 4 and the column-vector cluster of attached Table 10 is shown in attached Table 12;
Attached table 11
Figure BDA0002821820830000091
Attached table 12
Figure BDA0002821820830000092
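The construction in step 11 amounts to taking, for every pair of a row cluster and a column cluster, the sub-matrix at their intersection. A minimal sketch, assuming the cluster results are available as integer label arrays (the function name and the returned dictionary layout are hypothetical):

```python
import numpy as np

def build_local_matrices(R, row_labels, col_labels):
    """Return {(u, m): (row_idx, col_idx, submatrix)} for every co-cluster."""
    blocks = {}
    for u in np.unique(row_labels):
        rows = np.flatnonzero(row_labels == u)
        for m in np.unique(col_labels):
            cols = np.flatnonzero(col_labels == m)
            # np.ix_ selects the cross product of the chosen rows and columns.
            blocks[(int(u), int(m))] = (rows, cols, R[np.ix_(rows, cols)])
    return blocks

R = np.arange(12, dtype=float).reshape(3, 4)
row_labels = np.array([0, 1, 0])
col_labels = np.array([0, 0, 1, 1])
blocks = build_local_matrices(R, row_labels, col_labels)
```

Keeping the row and column indices alongside each block makes it straightforward to write filled values back to their original positions later.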
Step 12, filling each local matrix with the latent factor model $A = UV^{T}$, where A is the local matrix and U and V are the latent factor matrices of the users and the attributes, respectively; their numbers of rows are the number of users and the number of attributes, respectively, and their number of columns is the latent factor dimension, which is 3 in this example. Taking attached Table 11 as an example, solving $A = UV^{T}$ for the latent vector $U_8$ of user 8 and the latent vector $V_3$ of the attribute "province" gives $U_8 = [32.94, 48.43, 10.14]$ and $V_3 = [0.22, 0.04, 3.24]$. The missing value of user 8 on the attribute "province" is therefore $U_8 V_3^{T} \approx 42$. In this way a local matrix A' without missing values is obtained, and all local matrices obtained in step 11 are filled;
Step 13, writing the filled local matrices obtained in step 12 back into the data matrix;
Step 14, outputting the data matrix, as shown in attached Table 13.
Attached table 13
Figure BDA0002821820830000101
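The arithmetic quoted in step 12 and the write-back of step 13 can be checked with a short sketch. The two latent vectors are the ones given in the text; the helper `write_back` and its arguments are hypothetical.

```python
import numpy as np

# Latent vectors quoted in step 12 (latent factor dimension 3).
U8 = np.array([32.94, 48.43, 10.14])    # user 8
V3 = np.array([0.22, 0.04, 3.24])       # attribute "province"

# Inner product of the two latent vectors, the predicted missing value.
predicted = float(U8 @ V3)

def write_back(R, rows, cols, filled_block):
    """Copy filled values into R only where R is still missing."""
    out = R.copy()
    block = out[np.ix_(rows, cols)]      # advanced indexing returns a copy
    missing = np.isnan(block)
    block[missing] = filled_block[missing]
    out[np.ix_(rows, cols)] = block
    return out

R = np.array([[1.0, np.nan], [np.nan, 4.0]])
completed = write_back(R, [0, 1], [0, 1], np.array([[9.0, 2.0], [3.0, 9.0]]))
```

The inner product evaluates to roughly 42.04, consistent with the value 42 stated in step 12 once rounded to an integer province code.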
As attached Table 13 shows, the bidirectional clustering method for risk control system data completion provided by the invention can stably fill in missing data, which is important given the large volumes of missing data encountered at the present stage.
Comparison of experimental results on a public data set:
The public data set consists of 656,724 loan records published by "Lendingclub" between 2013 and 2015. A total of 115 attributes describe each loan application. The "loan status" attribute, which describes the current status of the loan, takes the following values: "issued", "current", "fully paid", "default", "charged off", "late (16-30 days)", "late (31-120 days)" and "in grace period". These statuses are reduced to a binary classification problem: loan applications with status "charged off", "default", "late (31-120 days)" or "late (16-30 days)" are considered "bad" or "default" loans, while "current", "fully paid" and "in grace period" are classified as "good" loans; the rest are ignored. A value of 0 indicates good credit and a value of 1 indicates bad credit or default. Loan amounts range from $1,000 to $35,000, and each loan has an associated grade (from A to G). The grades correspond to interest rates in increasing order, ranging from 5.32% to 29%. Loans with higher interest rates carry a higher risk of default: 31% of G-grade loans are bad loans, whereas only 3% of A-grade loans are. On this data set, the algorithms are compared by AUC; a higher AUC indicates higher accuracy.
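The reduction of loan statuses to a binary label can be sketched as below. The exact status strings are assumptions about how the values appear in the data set, and `to_binary_label` is a hypothetical helper.

```python
# Assumed spellings of the loan status values; adjust to the actual data.
BAD = {"Charged Off", "Default", "Late (31-120 days)", "Late (16-30 days)"}
GOOD = {"Current", "Fully Paid", "In Grace Period"}

def to_binary_label(loan_status):
    """Return 1 for bad/default loans, 0 for good loans, None for ignored statuses."""
    if loan_status in BAD:
        return 1
    if loan_status in GOOD:
        return 0
    return None                              # e.g. "Issued" is ignored

statuses = ["Current", "Charged Off", "Issued", "Fully Paid"]
labels = [to_binary_label(s) for s in statuses]
kept = [y for y in labels if y is not None]  # drop the ignored records
```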
For comparison, the applicant considered the following methods as baselines:
Offset: widely used as a benchmark for prediction accuracy; it uses the average of all users' data for an item as the predicted value.
ItemKNN: clusters the users' attributes into several subsets and uses the average of each subset as the predicted value.
MF: Matrix Factorization is a latent factor model that is widely applied in risk control systems.
ADFT: Alternative Distance Function Transformation learns a distance function from must-link and cannot-link constraints between instances and uses it to compute a transformation matrix, thereby generating an alternative clustering from a set of features.
MSC: Multiple Stable Clustering uses simplex constraints to generate different sparse weights assigned to the features and then applies spectral clustering to produce multiple stable clusterings.
MetaClustering: meta-clustering is a well-known method in the unsupervised multi-clustering category. It first assigns different weights to the features according to a Zipf distribution and then obtains multiple clusterings by applying k-means to the weighted features.
The method of this scheme is denoted DCM.
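As a point of reference for the simplest baseline, the Offset method as described (the average of all users' data for an item) corresponds to column-mean imputation. A minimal sketch under that reading, with a hypothetical function name:

```python
import numpy as np

def offset_baseline(R):
    """Fill each missing entry with the mean of the observed values in its column."""
    col_means = np.nanmean(R, axis=0)        # per-column mean, ignoring NaN
    filled = R.copy()
    missing = np.isnan(filled)
    # np.nonzero(missing)[1] gives the column index of every missing entry.
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    return filled

R = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]])
baseline = offset_baseline(R)
```

A column with no observed values at all would yield NaN here and would need separate handling.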
The experimental results are shown in the table below, which reports the AUC of the method of this scheme and of the baseline methods. The results show that the proposed DCM achieves the highest AUC.
TABLE 3
Method   Offset   ItemKNN   MF       ADFT     MSC      MetaClustering   DCM
AUC      66.80%   77.79%    79.69%   84.55%   87.97%   88.22%           92.09%
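For readers unfamiliar with the metric, AUC can be computed as the fraction of (bad, good) pairs in which the bad loan receives the higher score, with ties counted as one half. The sketch below uses that pairwise definition; it is a generic illustration and not the evaluation code behind the reported figures.

```python
import numpy as np

def auc(y_true, scores):
    """Pairwise AUC: P(score of a positive > score of a negative), ties count half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise comparison needs memory proportional to the number of positives times the number of negatives, so a rank-based formulation would be preferable for a data set with hundreds of thousands of records.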
Comparison of visual results: to further illustrate the performance of the method of this scheme, FIG. 4 compares the images obtained by filling with the clusters produced by ItemKNN and by DCM. With the same number of clusters, ItemKNN expresses the features less well than DCM, because DCM uses information from two dimensions for clustering.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims (3)

1. A bidirectional clustering method for risk control system data completion, characterized by comprising five steps, namely instance clustering, attribute clustering, local matrix construction, local matrix filling and matrix filling, wherein:
instance clustering assigns the sample points to different first clusters, each with its own centroid; the centroids of the first clusters are updated by a first update formula, and similarity in instance clustering is computed by a first distance formula, which is
Figure FDA0002821820820000011
where D is the number of attributes of a data object, and the subset c assigned to a cluster in instance clustering is given by
Figure FDA0002821820820000012
attribute clustering clusters the data obtained by instance clustering along the attribute dimension and assigns it to different second clusters, each with its own centroid; the centroids of the second clusters are updated by a second update formula, similarity in attribute clustering is computed by the first distance formula, and the subset d assigned to a cluster in attribute clustering is given by
Figure FDA0002821820820000013
local matrix construction co-clusters the results of instance clustering and attribute clustering to obtain local matrices;
local matrix filling fills each local matrix with a latent factor model, based on the correlation between users and items, to obtain a complete local matrix, wherein the latent factor model is $A = UV^{T}$, where A is the local matrix and U and V are the latent factor matrices of the users and the feature items, respectively;
and matrix filling writes the filled local matrices back into the original matrix to obtain a complete matrix.
2. The bidirectional clustering method for risk control system data completion of claim 1, wherein the first update formula and the second update formula are both
Figure FDA0002821820820000014
where $\mathrm{Center}_k$ is the centroid of the k-th cluster, $C_k$ denotes the k-th cluster, and $|C_k|$ is the number of data objects in the k-th cluster.
3. The bidirectional clustering method for risk control system data completion of claim 2, characterized in that, after $\mathrm{Center}_k$ is calculated, the sample point closest to the centroid is selected and becomes the new centroid.
CN202011439471.4A 2020-12-07 2020-12-07 Bidirectional clustering method for risk control system data completion Pending CN112488228A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011439471.4A CN112488228A (en) 2020-12-07 2020-12-07 Bidirectional clustering method for risk control system data completion


Publications (1)

Publication Number Publication Date
CN112488228A (en) 2021-03-12

Family

ID=74939966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011439471.4A Pending CN112488228A (en) 2020-12-07 2020-12-07 Bidirectional clustering method for risk control system data completion

Country Status (1)

Country Link
CN (1) CN112488228A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488662A (en) * 2013-04-01 2014-01-01 哈尔滨工业大学深圳研究生院 Clustering method and system of parallelized self-organizing mapping neural network based on graphic processing unit
CN105513370A (en) * 2015-12-29 2016-04-20 浙江大学 Traffic zone dividing method based on sparse vehicle license identification data
CN105955975A (en) * 2016-04-15 2016-09-21 北京大学 Knowledge recommendation method for academic literature
CN106484876A (en) * 2016-10-13 2017-03-08 中山大学 A kind of based on typical degree and the collaborative filtering recommending method of trust network
US20170235823A1 (en) * 2013-09-12 2017-08-17 Guangdong Electronics Industry Institute Ltd. Clustering method for multilingual documents
CN107124265A (en) * 2017-04-28 2017-09-01 淮安纷云软件有限公司 A kind of identity identifying method based on Hash hash tables
CN107833153A (en) * 2017-12-06 2018-03-23 广州供电局有限公司 A kind of network load missing data complementing method based on k means clusters
CN111812195A (en) * 2020-07-31 2020-10-23 江南大学 Method for classifying circumferential angles of pipeline defects obtained by eddy current testing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
喻金平: "基于混合蛙跳联合聚类的协同过滤算法", 《微电子学与计算机》 *
毕猛: "一种用于网络用户行为聚类的标签自动生成方法", 《计算机工程》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117788538A (en) * 2024-02-27 2024-03-29 南京信息工程大学 Registration method, device and system for consistency of point cloud interval pairing volume variances
CN117788538B (en) * 2024-02-27 2024-05-10 南京信息工程大学 Registration method, device and system for consistency of point cloud interval pairing volume variances

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210312)