CN110569915A - Automobile data clustering method and system based on intuitionistic fuzzy C-means - Google Patents

Automobile data clustering method and system based on intuitionistic fuzzy C-means

Info

Publication number
CN110569915A
Authority
CN
China
Prior art keywords
matrix
membership
center
clustering
cluster center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910865982.3A
Other languages
Chinese (zh)
Other versions
CN110569915B (en)
Inventor
耿玉水
王菲
张焕颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology
Priority to CN201910865982.3A
Publication of CN110569915A
Application granted
Publication of CN110569915B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques

Abstract

The invention discloses an automobile data clustering method and system based on intuitionistic fuzzy C-means. Set parameters and an automobile data set characteristic value matrix are input; the characteristic value matrix of the automobile data set is weighted using an improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix; a density parameter is defined and the initial cluster centers are determined; it is judged whether the number of iterations is smaller than a set iteration threshold, and if so the method enters the membership matrix calculation step, otherwise it enters the output step; membership matrix calculation step: the membership matrix is calculated; the cluster centers are updated using the membership matrix; it is judged whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous iteration and the same sum at the current iteration is smaller than a set threshold; if so, the method enters the output step, and if not, it returns to the membership matrix calculation step; output step: the membership matrix and the cluster centers are output to obtain the automobile data clustering result.

Description

Automobile data clustering method and system based on intuitionistic fuzzy C-means
Technical Field
The disclosure relates to the technical field of automobile data clustering, and in particular to an automobile data clustering method and system based on intuitionistic fuzzy C-means.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
In the course of implementing the present disclosure, the inventors found that the following technical problems exist in the prior art:
With the popularization of intelligent devices and the growth of the Internet and the Internet of Things, the volume of data has increased explosively, and the development of the automobile industry is attracting more and more attention. Vehicles generate a large amount of data during use and operation, such as the state of the battery and the motor. However, how to use this data for tasks such as knowledge mining and fault retrieval remains one of the difficulties in the industry.
Disclosure of Invention
In order to overcome the defects of the prior art, the disclosure provides an automobile data clustering method and system based on intuitionistic fuzzy C-means. The fuzzy C-means (FCM) algorithm is improved by using intuitionistic fuzzy entropy, and the initial cluster centers are selected by introducing a density parameter, which addresses the tendency of the fuzzy C-means algorithm to fall into a local optimum.
In a first aspect, the present disclosure provides an automobile data clustering method based on intuitionistic fuzzy C-means.
The automobile data clustering method based on intuitionistic fuzzy C-means comprises the following steps:
Inputting set parameters and an automobile data set characteristic value matrix;
Weighting the characteristic value matrix of the automobile data set using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
Defining a density parameter and determining the initial cluster centers;
Judging whether the number of iterations is smaller than a set iteration threshold; if so, entering the membership matrix calculation step, and otherwise entering the output step;
Membership matrix calculation step: calculating the membership matrix;
Updating the cluster centers using the membership matrix;
Judging whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous iteration and the same sum at the current iteration is smaller than a set threshold; if so, entering the output step, and if not, returning to the membership matrix calculation step;
Output step: outputting the membership matrix and the cluster centers, and obtaining the automobile data clustering result according to the maximum membership principle.
In a second aspect, the present disclosure also provides an automobile data clustering system based on intuitionistic fuzzy C-means.
The automobile data clustering system based on intuitionistic fuzzy C-means includes:
An input module configured to: input set parameters and an automobile data set characteristic value matrix;
A weighting module configured to: weight the characteristic value matrix of the automobile data set using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
An initial cluster center determination module configured to: define a density parameter and determine the initial cluster centers;
A first determination module configured to: judge whether the number of iterations is smaller than a set iteration threshold; if so, enter the membership matrix calculation module, and otherwise enter the output module;
A membership matrix calculation module configured to: calculate the membership matrix;
An update module configured to: update the cluster centers using the membership matrix;
A second determination module configured to: judge whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous iteration and the same sum at the current iteration is smaller than a set threshold; if so, enter the output module, and if not, return to the membership matrix calculation module;
An output module configured to: output the membership matrix and the cluster centers, and obtain the automobile data clustering result according to the maximum membership principle.
In a third aspect, the present disclosure also provides an electronic device comprising a memory and a processor, and computer instructions stored on the memory and executed on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method of the first aspect.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the steps of the method of the first aspect.
Compared with the prior art, the beneficial effects of this disclosure are:
To address the sensitivity of the traditional algorithm to the selection of the initial cluster centers, the data are divided into density regions, and the initial cluster centers are selected in the high-density region, which avoids the noise points of the low-density region. Intuitionistic fuzzy entropy is introduced to calculate the feature weights of the data set, and the characteristic values are weighted accordingly, so that the influence of the feature weights on the clustering result is taken into account.
The fuzzy C-means clustering method based on intuitionistic fuzzy sets improves the traditional FCM algorithm in an intuitionistic fuzzy environment. The initial cluster centers are selected by defining a region density parameter, which avoids the noise in the data set; the fuzzy entropy is used to weight the features, which reflects the fuzziness of the fuzzy set more accurately and improves both the clustering effect and the time complexity of the algorithm. The automobiles are divided into 3 classes according to their characteristics, which can serve as one reference when purchasing an automobile. Combined with the clustering analysis result, the degree of association between automobile data can be judged accurately, which helps operators and manufacturers find problems in areas such as energy consumption and fault analysis in time.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a flow chart of the method of the first embodiment.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiment 1 provides an automobile data clustering method based on intuitionistic fuzzy C-means.
As shown in FIG. 1, the automobile data clustering method based on intuitionistic fuzzy C-means includes:
S1: inputting set parameters and an automobile data set characteristic value matrix;
S2: weighting the characteristic value matrix of the automobile data set using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
S3: defining a density parameter and determining the initial cluster centers;
S4: judging whether the number of iterations is smaller than a set iteration threshold; if so, going to S5, otherwise going to S8;
S5: calculating the membership matrix;
S6: updating the cluster centers using the membership matrix;
S7: judging whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous iteration and the same sum at the current iteration is smaller than a set threshold; if so, going to S8, and if not, returning to S5;
S8: outputting the membership matrix and the cluster centers, and obtaining the automobile data clustering result according to the maximum membership principle.
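The control flow of S4 to S8 can be sketched as a plain loop. This is a minimal illustration, not the patented implementation: the function name `run_clustering` and the three callables passed to it are hypothetical placeholders for steps S5, S6 and the objective used in S7.

```python
def run_clustering(update_membership, update_centers, objective,
                   centers, max_iter, eps):
    """Iterate S4-S7 and return (membership, centers) as in S8.

    All parameter names are illustrative assumptions:
    update_membership(centers) -> membership      (S5)
    update_centers(membership) -> centers         (S6)
    objective(membership, centers) -> float       (used in S7)
    """
    t = 0
    prev_j = None
    membership = None
    while t < max_iter:                              # S4: iteration cap
        membership = update_membership(centers)      # S5
        centers = update_centers(membership)         # S6
        j = objective(membership, centers)           # S7
        if prev_j is not None and prev_j - j < eps:
            break                                    # converged: go to S8
        prev_j = j
        t += 1
    return membership, centers                       # S8
```

The first pass has no previous objective value, so the convergence test in S7 only takes effect from the second pass onward.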
As one or more embodiments, in S1, the set parameters include: the number of clusters c, the ambiguity parameter m, the threshold ε for stopping iteration, and the number of iterations t; the automobile data set characteristic value matrix G comprises: the fuel consumption of each automobile, the friction coefficient of the automobile tires, the price of the automobile, the comfort level of the automobile and the safety factor of the automobile.
It should be understood that, in S1, the automobile data set characteristic value matrix G is shown in Table 1. For five models of automobiles $x_i$ ($i = 1, 2, \ldots, 5$), each automobile has 6 attributes $Q = \{q_1, q_2, \ldots, q_6\}$, where $q_1$ is the fuel consumption of the automobile, $q_2$ is the friction coefficient of the automobile tires, $q_3$ is the price of the automobile, $q_4$ is the comfort level, $q_5$ is the aesthetics, and $q_6$ is the safety factor. The automobile data set characteristic value matrix G is a 5 × 6 matrix composed of the 6 attribute values of each of the five models.
It should be understood that the weighted characteristic value matrix obtained is shown in Table 2.
As one or more embodiments, in S2, the characteristic value matrix G of the automobile data set is weighted using the improved intuitionistic fuzzy entropy to obtain the weighted characteristic value matrix, with the following specific steps:
Let the set of automobiles be $X = \{x_1, x_2, \ldots, x_n\}$, and let $A = \{\langle x_i, \mu_A(x_i), \nu_A(x_i) \rangle \mid x_i \in X\}$ be an intuitionistic fuzzy set on X,
where $f_A(x) = 1 - |\mu_A(x) - \nu_A(x)|$ and $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$. A denotes the intuitionistic fuzzy set of the relevant automobile parameters, and the intuitionistic fuzzy entropy is computed for the j-th column. $E(A)$ is the intuitionistic fuzzy entropy of A; $f_A(x_i)$ is the fuzziness of the automobile parameter index $x_i$ in A; $\pi_A(x_i)$ is the hesitation degree of $x_i$ in A; $\delta_A(x)$ is the fuzziness of the interval number; $\omega_j$ is the weight of the j-th feature of the characteristic value matrix G; $\mu_A(x)$ is the membership degree; and $\nu_A(x)$ is the non-membership degree.
Calculating the weighted characteristic value matrix $G'$:
$g'_{ij} = \omega_j g_{ij} = \langle 1 - (1 - \mu_{ij})^{\omega_j}, (\nu_{ij})^{\omega_j} \rangle$;
where the weighted characteristic value matrix is $G' = (g'_{ij})_{n \times s}$; $g_{ij}$ is a characteristic value of the matrix, $\mu_{ij}$ is its membership degree, and $\nu_{ij}$ is its non-membership degree.
The features are weighted using the new intuitionistic fuzzy entropy: the larger the fuzzy entropy $E(A)$, the higher the fuzziness and uncertainty of the set, and the smaller the weight of the feature; conversely, the smaller the fuzzy entropy $E(A)$, the greater the weight of the feature.
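The quantities that are fully specified in this step, namely the fuzziness $f_A$, the hesitation degree $\pi_A$ and the weighting $g'_{ij} = \langle 1-(1-\mu_{ij})^{\omega_j}, (\nu_{ij})^{\omega_j} \rangle$, can be sketched with NumPy as follows. The function names are hypothetical, and the feature weights $\omega_j$ are taken as given inputs, since the entropy formula that produces them is not reproduced in this text.

```python
import numpy as np

def fuzziness(mu, nu):
    """f_A(x) = 1 - |mu_A(x) - nu_A(x)|."""
    return 1.0 - np.abs(np.asarray(mu) - np.asarray(nu))

def hesitation(mu, nu):
    """pi_A(x) = 1 - mu_A(x) - nu_A(x)."""
    return 1.0 - np.asarray(mu) - np.asarray(nu)

def weight_features(mu, nu, weights):
    """Apply g'_ij = <1 - (1 - mu_ij)^w_j, nu_ij^w_j> column by column.

    mu, nu: (n, s) arrays of membership and non-membership degrees.
    weights: length-s array of feature weights (assumed to be precomputed).
    """
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    w = np.asarray(weights, dtype=float)   # broadcasts over the rows
    return 1.0 - (1.0 - mu) ** w, nu ** w
```

A weight of 1 leaves a feature unchanged, while a weight between 0 and 1 lowers the membership degree and raises the non-membership degree of that feature.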
It should be understood that, in S3, a density parameter is defined and the initial cluster centers are determined, as shown in Table 3.
As one or more embodiments, defining the density parameter and determining the initial cluster centers comprises the following specific steps:
Let $t = 0$ and determine the initial cluster centers $V^{(t)}$.
Calculate the density of the region in which the feature vector $G'_i$ lies, and define the region density parameter $\rho_i$. Taking $G'_i$ as the center, calculate the n Euclidean distances $d(G'_i, G'_1), d(G'_i, G'_2), \ldots, d(G'_i, G'_n)$ and sort them in ascending order, so that
$d(G'_i, G'_1) \le d(G'_i, G'_2) \le \cdots \le d(G'_i, G'_n)$
where $d(G'_i, G'_1)$ is the Euclidean distance between the first region $G'_1$ and the region in which the feature vector $G'_i$ lies; $d(G'_i, G'_2)$ is the Euclidean distance between the second region $G'_2$ and the region in which $G'_i$ lies; and $d(G'_i, G'_n)$ is the Euclidean distance between the n-th region $G'_n$ and the region in which $G'_i$ lies.
Since $d(G'_i, G'_i) = 0$, after sorting, the minimum Euclidean distance that contains N feature vectors, including $G'_i$ itself, is denoted $R(G'_i)$:
$R(G'_i) = d(G'_i, G'_{(N)})$
where $G'_i$ is the feature vector whose region density is computed; $G'_{(N)}$ is the N-th feature vector of the region in which $G'_i$ lies; and N is an integer with $0 < N < n$.
The region of $G'_i$ contains the N feature vectors $G'_{(1)}, G'_{(2)}, \ldots, G'_{(N)}$, and the region density parameter $\rho_i$ of $G'_i$ is given by formula (2),
where $R(G'_i)$ is the minimum Euclidean distance that contains the N feature vectors including $G'_i$.
Using $R(G'_i)$ and formula (2) for $\rho_i$, calculate the region density parameter of each feature vector. After comparison, take the $G'_i$ with the largest region density as the first cluster center $V_1$, which gives the feature vector set of the high-density region $P = \{G'_{(1)}, G'_{(2)}, \ldots, G'_{(N)}\}$. Take the feature vector in P farthest from $V_1$ as the second cluster center $V_2$. Then calculate the distances from all feature vectors in P to $V_1$ and to $V_2$, and take from P the third cluster center $V_3$ satisfying the following condition:
$\max(\min(d(G'_{(r)}, V_1), d(G'_{(r)}, V_2))), \quad r = 1, 2, \ldots, N$
That is, the feature vector farthest from both the first cluster center $V_1$ and the second cluster center $V_2$ is taken as the third cluster center.
Finally, calculate the distances from all feature vectors in P to $V_1, V_2, \ldots, V_{k-1}$, and take from P the feature vector satisfying
$\max(\min(d(G'_{(r)}, V_1), d(G'_{(r)}, V_2), \ldots, d(G'_{(r)}, V_{k-1}))), \quad r = 1, 2, \ldots, N$
as the k-th cluster center $V_k$ ($k = 1, 2, \ldots, c$). Proceeding in this order yields the cluster center set $V = \{V_1, V_2, \ldots, V_c\}$.
In an intuitionistic fuzzy environment, the region from which the initial cluster centers are selected can be subdivided by density, so that noise points in the low-density region are avoided and the initial cluster centers are selected only in the high-density region. This effectively avoids the influence that an improper choice of initial values may have on the clustering result.
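The selection procedure above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented formula: feature vectors are treated as plain real-valued rows, the function name `initial_centers` is hypothetical, and, because formula (2) for $\rho_i$ is not reproduced here, the vector with the smallest radius $R(G'_i)$ is assumed to be the one with the largest region density.

```python
import numpy as np

def initial_centers(X, n_neighbors, c):
    """Return the row indices of c initial cluster centers.

    X: (n, s) array of feature vectors (assumed real-valued here).
    n_neighbors: N, the number of vectors that define a region (0 < N < n).
    c: number of cluster centers, assumed to satisfy c <= N.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances d(G'_i, G'_j), shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # R(G'_i): distance to the N-th nearest vector, counting G'_i itself.
    radius = np.sort(dist, axis=1)[:, n_neighbors - 1]
    # Assumption: the smallest radius corresponds to the largest density.
    first = int(np.argmin(radius))
    # P: the N vectors nearest to the first center, itself included.
    region = np.argsort(dist[first], kind="stable")[:n_neighbors]
    chosen = [first]
    while len(chosen) < c:
        # Distance from each candidate in P to its nearest chosen center.
        nearest = dist[np.ix_(region, chosen)].min(axis=1)
        # Max-min rule: take the candidate farthest from all chosen centers.
        chosen.append(int(region[np.argmax(nearest)]))
    return chosen
```

Because every later center is drawn from P, an isolated outlier can never become an initial center, which is the noise-avoidance property described above.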
As one or more embodiments, the step S4 includes:
Judging whether t is smaller than an iteration threshold value delta, if so, continuing to S5; if not, it jumps to step S8.
As one or more embodiments, in S5, the calculated membership matrices are shown in Tables 4 and 6.
As one or more embodiments, step S5 includes:
Calculating the membership matrix $U^{(t)}$, $U^{(t)} = (u_{ik})_{c \times l}$.
The membership $u_{ik}$ is defined in two cases: the case in which there exists an $l$ with $1 \le l \le c$ such that $d(G'_i, V_l) = 0$, and the case in which $d(G'_i, V_l) > 0$ for every $l = 1, 2, \ldots, c$.
Here m is the ambiguity parameter.
In these expressions: $u_{ik}$ is the degree to which a sample belongs to a cluster center, and the values $u_{ik}$ form the membership matrix; D is an ambiguity matrix; G is the sample variance; V is the cluster center matrix formed by the cluster centers; $d(G'_i, V_l)$ is the Euclidean distance between the feature vector $G'_i$ and the cluster center $V_l$; $G'$ is the weighted characteristic value matrix; and $V_l$ is the l-th cluster center.
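The two cases of the membership calculation can be illustrated with the standard fuzzy C-means update, $u_{ik} = 1 / \sum_l (d_{ik}/d_{il})^{2/(m-1)}$, with full membership assigned when a sample coincides with a center. This is a sketch that assumes the standard FCM formula; the function name `update_membership` and the (n, c) layout of the distance matrix are illustrative choices.

```python
import numpy as np

def update_membership(dist, m):
    """Standard FCM membership update (assumed form).

    dist: (n, c) array with dist[i, k] = d(G'_i, V_k).
    m: ambiguity (fuzzifier) parameter, m > 1.
    Returns an (n, c) array whose rows sum to 1.
    """
    dist = np.asarray(dist, dtype=float)
    u = np.zeros_like(dist)
    zero = dist == 0.0
    on_center = zero.any(axis=1)
    # Case 1: the sample coincides with at least one center.
    # It gets full membership there (shared equally if several coincide).
    u[on_center] = zero[on_center] / zero[on_center].sum(axis=1, keepdims=True)
    # Case 2: all distances are positive.
    d = dist[~on_center]
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    u[~on_center] = 1.0 / ratio.sum(axis=2)
    return u
```

Handling the zero-distance case separately avoids a division by zero in the ratio term.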
As will be appreciated, in S6, the cluster centers updated using the membership matrix are shown in Tables 5 and 7.
As one or more embodiments, step S6 includes:
Updating the cluster centers $V^{(t)}$ using the membership matrix, where the k-th cluster center is denoted $V_k$:
$V_k = \{v_{k1}, v_{k2}, \ldots, v_{ks}\}$
where $v_{kj} = \langle \alpha_{kj}, \beta_{kj} \rangle$,
and where $v_{ks}$ is a characteristic value vector in the characteristic value matrix; $\alpha_{kj}$ and $\beta_{kj}$ are the components of the intuitionistic fuzzy number of the cluster center; $u_{ik}$ is an entry of the membership matrix; $\mu_{ij}$ is the membership degree of $u_i$ with respect to $v_j$; $\nu_{ij}$ is the non-membership degree of $u_i$ with respect to $v_j$; and m is the ambiguity parameter.
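The expressions for $\alpha_{kj}$ and $\beta_{kj}$ are not reproduced in this text. As one plausible illustration only, the sketch below assumes the centers are membership-weighted aggregates of intuitionistic fuzzy numbers using the intuitionistic fuzzy weighted averaging (IFWA) operator, with weights $u_{ik}^m$ normalized per cluster. This swapped-in operator, the function name `update_centers` and the array layout are assumptions, not the patent's stated formula.

```python
import numpy as np

def update_centers(u, mu, nu, m):
    """Aggregate samples into centers <alpha_kj, beta_kj> (assumed IFWA form).

    u: (n, c) membership matrix.
    mu, nu: (n, s) membership and non-membership degrees of the samples.
    m: ambiguity parameter.
    Returns (alpha, beta), each of shape (c, s).
    """
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    w = np.asarray(u, dtype=float) ** m
    w = w / w.sum(axis=0, keepdims=True)       # weights sum to 1 per cluster
    # alpha_kj = 1 - prod_i (1 - mu_ij)^w_ik ; beta_kj = prod_i nu_ij^w_ik
    alpha = 1.0 - np.prod((1.0 - mu)[:, None, :] ** w[:, :, None], axis=0)
    beta = np.prod(nu[:, None, :] ** w[:, :, None], axis=0)
    return alpha, beta
```

This choice is consistent with the weighting rule $\langle 1-(1-\mu)^{\omega}, \nu^{\omega} \rangle$ used in S2, and with crisp memberships it returns the assigned sample itself as the center.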
As one or more embodiments, step S7 includes:
The sum of squared Euclidean distances of the data set with respect to the cluster centers is denoted $J(U, V)$. Judge whether $J(U^{(t-1)}, V^{(t-1)}) - J(U^{(t)}, V^{(t)}) < \varepsilon$ holds; if it does, go to S8, otherwise return to S5;
where $U^{(t-1)}$ is the membership matrix at iteration $t-1$, $V^{(t-1)}$ is the set of cluster centers corresponding to $U^{(t-1)}$, $U^{(t)}$ is the membership matrix at iteration $t$, $V^{(t)}$ is the set of cluster centers corresponding to $U^{(t)}$, $d(G'_i, V_k)$ is the Euclidean distance between the feature vector $G'_i$ and the cluster center $V_k$, $G'$ is the weighted characteristic value matrix, $V_l$ is the l-th cluster center, and $\varepsilon$ is the threshold for stopping iteration.
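The stopping test of S7 can be sketched as below. The expression for $J$ is not reproduced in this text, so the sketch assumes the standard FCM objective $J(U, V) = \sum_i \sum_k u_{ik}^m \, d^2(G'_i, V_k)$; the function names are hypothetical.

```python
import numpy as np

def objective(u, dist, m):
    """Membership-weighted sum of squared distances (assumed standard FCM J).

    u: (n, c) membership matrix; dist: (n, c) distances d(G'_i, V_k).
    """
    u = np.asarray(u, dtype=float)
    dist = np.asarray(dist, dtype=float)
    return float(np.sum(u ** m * dist ** 2))

def has_converged(j_prev, j_curr, eps):
    """S7 test: J(U^(t-1), V^(t-1)) - J(U^(t), V^(t)) < eps."""
    return (j_prev - j_curr) < eps
```

The test follows the signed difference stated in S7 rather than an absolute difference.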
As will be appreciated, in S8, the membership matrix and the cluster centers are output, and the clustering result obtained according to the maximum membership principle is shown in Table 9.
As one or more embodiments, step S8 includes:
Outputting the membership matrix U and the cluster centers V, and obtaining the clustering result according to the maximum membership principle: the automobiles of the three models $x_1, x_2, x_3$ belong to the same cluster or share clusters, and the automobiles of the other two models each belong to other clusters.
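The maximum membership principle assigns each automobile to the cluster in which its membership is largest. A minimal sketch, assuming the membership matrix is stored with one row per automobile and one column per cluster (the function name is hypothetical):

```python
import numpy as np

def assign_clusters(u):
    """Return, for each row of u, the index of the cluster with the largest membership."""
    return np.argmax(np.asarray(u, dtype=float), axis=1)
```

The full membership rows remain available for cases where an automobile has comparable membership in several clusters.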
In real-life applications, it makes sense for an instance to share features of multiple clusters. For example, Volvo is known for its high safety, but its price is also relatively high, so a Volvo can be grouped with both safe vehicles and luxury vehicles. This result offers guidance to customers deciding which automobile to purchase.
Table 1: Automobile sample data set G
Table 2: Weighted sample characteristic value matrix obtained by calculating the feature weights
Table 3: Initial cluster centers
Table 4: Membership matrix corresponding to the initial cluster centers
Table 5: Cluster centers when t = 1
Table 6: Membership matrix when t = 1
Table 7: Cluster centers when t = 2
Table 8: Membership matrix when t = 2
Table 9: Clustering results of this disclosure on the automobile data set
Embodiment 2 provides an automobile data clustering system based on intuitionistic fuzzy C-means.
The automobile data clustering system based on intuitionistic fuzzy C-means includes:
An input module configured to: inputting a set parameter and an automobile data set characteristic value matrix;
A weighting module configured to: carrying out weighting calculation on the characteristic value matrix of the automobile data set by using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
An initial cluster center determination module configured to: defining a density parameter and determining an initial clustering center;
A first determination module configured to: judging whether the iteration times are smaller than a set iteration threshold value, if so, entering a membership matrix calculation module, and otherwise, entering an output module;
A membership matrix calculation module configured to: calculating a membership matrix;
An update module configured to: updating the clustering center by using the membership matrix;
A second determination module configured to: judging whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous iteration and the same sum at the current iteration is smaller than a set threshold; if yes, entering an output module, and if not, returning to a membership matrix calculation module;
An output module configured to: and outputting a membership matrix and a clustering center, and solving an automobile data clustering result according to a maximum membership principle.
The present disclosure also provides an electronic device, which includes a memory, a processor, and computer instructions stored in the memory and executed on the processor. When the computer instructions are executed by the processor, each operation in the method is completed; details are not repeated here for brevity.
The electronic device may be a mobile terminal or a non-mobile terminal. The non-mobile terminal includes a desktop computer, and the mobile terminal includes a smartphone (such as an Android phone or an iOS phone), smart glasses, a smart watch, a smart bracelet, a tablet computer, a notebook computer, a personal digital assistant, and other mobile internet devices capable of wireless communication.
It should be understood that in the present disclosure, the processor may be a central processing unit (CPU), but may also be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general purpose processor may be a microprocessor or any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The steps of a method disclosed in connection with the present disclosure may be embodied directly in a hardware processor, or in a combination of hardware and software modules within the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, this is not described in detail here. Those of ordinary skill in the art will appreciate that the various illustrative elements, i.e., algorithm steps, described in connection with the embodiments disclosed herein may be implemented as electronic hardware or as combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is merely a division of one logic function, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. The automobile data clustering method based on intuitionistic fuzzy C-means is characterized by comprising the following steps:
Inputting a set parameter and an automobile data set characteristic value matrix;
Carrying out weighting calculation on the characteristic value matrix of the automobile data set by using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
Defining a density parameter and determining an initial clustering center;
Judging whether the iteration times are smaller than a set iteration threshold value, if so, entering a membership matrix calculation step, and otherwise, entering an output step;
A membership matrix calculation step: calculating a membership matrix;
Updating the clustering center by using the membership matrix;
Judging whether the difference value between the Euclidean distance square sum of the data set at the previous moment relative to the cluster center and the Euclidean distance square sum of the data set at the current moment relative to the cluster center is smaller than a set threshold value or not; if yes, entering an output step, and if not, returning to the membership degree matrix calculation step;
An output step: and outputting a membership matrix and a clustering center, and solving an automobile data clustering result according to a maximum membership principle.
2. The method of claim 1, wherein said setting parameters comprises: clustering number c, ambiguity parameter m, threshold epsilon for stopping iteration and iteration times t; the automobile data set characteristic value matrix G comprises: the fuel consumption of each automobile, the friction coefficient of automobile tires, the price of the automobile, the comfort level of the automobile and the safety factor of the automobile.
3. The method as claimed in claim 1, wherein the characteristic value matrix G of the automobile data set is weighted using the improved intuitionistic fuzzy entropy to obtain the weighted characteristic value matrix, with the following specific steps:
Let the set of automobiles be $X = \{x_1, x_2, \ldots, x_n\}$, and let $A = \{\langle x_i, \mu_A(x_i), \nu_A(x_i) \rangle \mid x_i \in X\}$ be an intuitionistic fuzzy set on X,
where $f_A(x) = 1 - |\mu_A(x) - \nu_A(x)|$ and $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$; A denotes the intuitionistic fuzzy set of the relevant automobile parameters, and the intuitionistic fuzzy entropy is computed for the j-th column; $E(A)$ is the intuitionistic fuzzy entropy of A; $f_A(x_i)$ is the fuzziness of the automobile parameter index $x_i$ in A; $\pi_A(x_i)$ is the hesitation degree of $x_i$ in A; $\delta_A(x)$ is the fuzziness of the interval number; $\omega_j$ is the weight of the j-th feature of the characteristic value matrix G; $\mu_A(x)$ is the membership degree; and $\nu_A(x)$ is the non-membership degree;
Calculating the weighted characteristic value matrix $G'$:
$g'_{ij} = \omega_j g_{ij} = \langle 1 - (1 - \mu_{ij})^{\omega_j}, (\nu_{ij})^{\omega_j} \rangle$;
where the weighted characteristic value matrix is $G' = (g'_{ij})_{n \times s}$; $g_{ij}$ is a characteristic value of the matrix, $\mu_{ij}$ is its membership degree, and $\nu_{ij}$ is its non-membership degree.
4. The method of claim 1, wherein defining the density parameter and determining the initial cluster centers comprises the following specific steps:
let t = 0 and determine the initial cluster centers $V^{(t)}$;
compute the density of the region in which each feature vector $G'_i$ lies, defining the region density parameter $\rho_i$: taking $G'_i$ as the center, calculate the n Euclidean distances $d(G'_i, G'_1), d(G'_i, G'_2), \ldots, d(G'_i, G'_n)$ and sort them in ascending order, so that
$d(G'_i, G'_1) \le d(G'_i, G'_2) \le \cdots \le d(G'_i, G'_n)$,
wherein $d(G'_i, G'_1)$ is the Euclidean distance measure between the first region $G'_1$ and the region of the feature vector $G'_i$; $d(G'_i, G'_2)$ is the Euclidean distance measure between the second region $G'_2$ and the region of the feature vector $G'_i$; and $d(G'_i, G'_n)$ is the Euclidean distance measure between the n-th region $G'_n$ and the region of the feature vector $G'_i$;
since $d(G'_i, G'_i) = 0$, after sorting, the minimum Euclidean distance that encloses N feature vectors, $G'_i$ included, is denoted $R(G'_i)$:
$R(G'_i) = d(G'_i, G'_{(N)})$,
wherein $G'_i$ is the feature vector whose region density is computed; $G'_{(N)}$ is the N-th feature vector of the region in which $G'_i$ lies; and N is an integer with $0 < N < n$;
the region of $G'_i$ contains the N feature vectors $G'_{(1)}, G'_{(2)}, \ldots, G'_{(N)}$, and the region density parameter $\rho_i$ of $G'_i$ is defined as a function of $R(G'_i)$,
wherein $R(G'_i)$ is the minimum Euclidean distance that encloses N feature vectors, $G'_i$ included;
using the formulas for $R(G'_i)$ and $\rho_i$, calculate the region density parameter of every feature vector and, after comparison, select the $G'_i$ with the greatest region density as the first cluster center $V_1$, obtaining the feature vector set of the high-density region $P = \{G'_{(1)}, G'_{(2)}, \ldots, G'_{(N)}\}$; take the feature vector in P farthest from $V_1$ as the second cluster center $V_2$; then calculate the distances from all feature vectors in P to $V_1$ and to $V_2$, and take as the third cluster center $V_3$ the feature vector in P that satisfies the following condition:
$\max(\min(d(G'_{(r)}, V_1), d(G'_{(r)}, V_2))),\ r = 1, 2, \ldots, N$,
that is, the feature vector farthest from both the first cluster center $V_1$ and the second cluster center $V_2$ is taken as the third cluster center;
finally, calculate the distances from all feature vectors in P to $V_1, V_2, \ldots, V_{k-1}$, and take from P the feature vector that attains
$\max(\min(d(G'_{(r)}, V_1), d(G'_{(r)}, V_2), \ldots, d(G'_{(r)}, V_{k-1}))),\ r = 1, 2, \ldots, N$
as the k-th cluster center $V_k$ ($k = 1, 2, \ldots, c$), thereby obtaining in this order the cluster center set $V = \{V_1, V_2, \ldots, V_c\}$.
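The density-based selection of initial centers in claim 4 can be sketched as below. The density formula itself is not reproduced in the text, so the sketch assumes only that density falls as R grows for a fixed N, which makes the densest vector the one with the smallest R. Feature vectors are treated as plain numeric tuples under ordinary Euclidean distance, and the names `initial_centers` and `dist` are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_centers(points, c, N):
    """Pick c initial centers from the densest N-point neighbourhood."""
    n = len(points)
    if not (0 < N < n) or not (1 <= c <= N):
        raise ValueError("require 0 < N < n and 1 <= c <= N")

    def neighbours(i):
        # Indices of all points sorted by distance from point i (itself first).
        return sorted(range(n), key=lambda j: dist(points[i], points[j]))

    order = [neighbours(i) for i in range(n)]
    # R(G'_i): distance from point i to its N-th nearest point, itself included.
    radius = [dist(points[i], points[order[i][N - 1]]) for i in range(n)]
    # Assumption: density decreases with R, so the densest point has the smallest R.
    first = min(range(n), key=radius.__getitem__)
    P = order[first][:N]  # high-density region around the first center
    chosen = [first]
    while len(chosen) < c:
        # Max-min rule: take the point in P whose nearest chosen center is farthest.
        candidates = [r for r in P if r not in chosen]
        nxt = max(candidates,
                  key=lambda r: min(dist(points[r], points[k]) for k in chosen))
        chosen.append(nxt)
    return [points[i] for i in chosen]


# Hypothetical data: four nearby points and one outlier.
points = [(0, 0), (4, 0), (0, 3), (1, 1), (20, 20)]
print(initial_centers(points, c=3, N=4))  # [(1, 1), (4, 0), (0, 3)]
```

Because every center is drawn from the high-density set P, the outlier at (20, 20) is never selected, which is the practical motivation for restricting the max-min search to P.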
5. The method of claim 1, wherein the membership matrix calculating step comprises the specific steps of:
Calculating the membership matrix $U^{(t)}$, $U^{(t)} = (u_{ik})_{c \times l}$, by cases:
when there exists an l with $1 \le l \le c$ such that $d(G'_i, V_l) = 0$, the first membership formula applies;
when $d(G'_i, V_l) > 0$ for every $l = 1, 2, \ldots, c$, the second membership formula applies;
m is the fuzziness parameter;
wherein: $u_{ik}$ is the degree to which a sample belongs to the cluster center, and the $u_{ik}$ form the membership matrix; d is the fuzziness matrix; g is the sample variance; v is the cluster center matrix formed by the k cluster centers; $d(G'_i, V_l)$ is the Euclidean distance between the feature vector $G'_i$ and the cluster center $V_l$; G′ is the characteristic value matrix; and $V_l$ is the l-th cluster center.
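The two cases of the membership calculation can be sketched with the standard fuzzy C-means update, which is assumed here because the claim's own formulas are not reproduced in the text. A sample that coincides with a center gets full membership in that center; otherwise $u_{ik} = 1 / \sum_l (d_{ik}/d_{il})^{2/(m-1)}$. Samples and centers are treated as plain numeric tuples, the matrix is stored as one row per sample, and all names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def membership_matrix(points, centers, m=2.0):
    """Standard FCM membership update (assumed form), one row per sample."""
    U = []
    for x in points:
        d = [dist(x, v) for v in centers]
        zeros = [k for k, dk in enumerate(d) if dk == 0.0]
        if zeros:
            # Zero-distance case: the sample sits on a center, so it belongs
            # fully to it (shared equally if several centers coincide).
            row = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(len(d))]
        else:
            # Positive-distance case: inverse-distance weighting with
            # exponent 2 / (m - 1).
            p = 2.0 / (m - 1.0)
            row = [1.0 / sum((dk / dl) ** p for dl in d) for dk in d]
        U.append(row)
    return U


# Hypothetical one-dimensional example with two centers.
U = membership_matrix([(0,), (1,), (3,)], [(0,), (4,)], m=2.0)
for row in U:
    print([round(u, 3) for u in row])
```

Each row sums to 1, and m must be greater than 1 for the exponent to be defined.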
6. The method of claim 1, wherein the step of updating the cluster centers using the membership matrix comprises:
updating the cluster centers $V^{(t)}$ using the membership matrix, wherein the k-th cluster center is denoted $V_k$:
$V_k = \{v_{k1}, v_{k2}, \ldots, v_{ks}\}$,
in which $v_{kj} = \langle \alpha_{kj}, \beta_{kj} \rangle$,
wherein $v_{ks}$ is a feature value vector in the characteristic value matrix; $\alpha_{kj}$ and $\beta_{kj}$ are the components of the intuitionistic fuzzy number of the cluster center; $u_{ik}$ is an entry of the membership matrix; $\mu_{ij}$ is the membership degree of $u_i$ corresponding to $v_j$; $v_{ij}$ is the non-membership degree of $u_i$ corresponding to $v_j$; and m is the fuzziness parameter.
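One plausible form of the center update can be sketched as follows. The formulas for αkj and βkj are not reproduced in the text, so this sketch assumes the intuitionistic fuzzy weighted average: with normalized weights $w_i = u_{ik}^m / \sum_i u_{ik}^m$, it takes $\alpha_{kj} = 1 - \prod_i (1 - \mu_{ij})^{w_i}$ and $\beta_{kj} = \prod_i (v_{ij})^{w_i}$. That choice matches the weighting rule used for g′ij but remains an assumption, as are the function name and the sample data.

```python
def update_center(G, u_k, m=2.0):
    """Update one cluster center as a list of (alpha, beta) pairs.

    Assumptions: G is a list of samples, each a list of (mu, nu) pairs;
    u_k holds each sample's membership in cluster k; the aggregation is
    the intuitionistic fuzzy weighted average with weights u^m / sum(u^m).
    """
    powered = [u ** m for u in u_k]
    total = sum(powered)
    weights = [p / total for p in powered]
    center = []
    for j in range(len(G[0])):
        one_minus_alpha = 1.0
        beta = 1.0
        for row, w in zip(G, weights):
            mu, nu = row[j]
            one_minus_alpha *= (1.0 - mu) ** w
            beta *= nu ** w
        center.append((1.0 - one_minus_alpha, beta))
    return center


# Hypothetical data: two samples, one attribute, equal memberships.
G = [[(0.75, 0.0625)], [(0.0, 1.0)]]
print(update_center(G, [1.0, 1.0]))  # [(0.5, 0.25)]
```

When one sample has membership 1 and all others 0, the center reduces to that sample's own attribute values, which is a quick sanity check for any implementation of this step.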
7. The method of claim 1, wherein the step of judging whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous moment and that at the current moment is smaller than the set threshold comprises:
the sum of squared Euclidean distances of the data set relative to the cluster centers is given by $J(U, V)$; judge whether $J(U^{(t-1)}, V^{(t-1)}) - J(U^{(t)}, V^{(t)}) < \varepsilon$ holds; if so, enter the output step; otherwise, jump to the membership matrix calculation step;
wherein $U^{(t-1)}$ is the membership matrix at iteration t−1; $V^{(t-1)}$ is the set of cluster centers corresponding to $U^{(t-1)}$; $U^{(t)}$ is the membership matrix at iteration t; $V^{(t)}$ is the set of cluster centers corresponding to $U^{(t)}$; $d(G'_i, V_k)$ is the Euclidean distance between the feature vector $G'_i$ and the cluster center $V_k$; G′ is the characteristic value matrix; $V_l$ is the l-th cluster center; and ε is the iteration threshold.
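The stopping test in claim 7 can be sketched under the assumption that J takes the standard fuzzy C-means form $J(U, V) = \sum_i \sum_k u_{ik}^m \, d(G'_i, V_k)^2$; the claim's own formula for J is not reproduced in the text. The function names and numeric values are hypothetical.

```python
def objective(U, sq_dists, m=2.0):
    """Assumed FCM objective: sum of u_ik^m times the squared distance.

    U and sq_dists are lists of rows, one row per sample and one column
    per cluster; sq_dists[i][k] is d(G'_i, V_k) squared.
    """
    return sum(
        u ** m * d2
        for u_row, d_row in zip(U, sq_dists)
        for u, d2 in zip(u_row, d_row)
    )


def has_converged(j_prev, j_curr, eps):
    """Stop when J(U(t-1), V(t-1)) - J(U(t), V(t)) < eps."""
    return j_prev - j_curr < eps


# Hypothetical memberships and squared distances for two samples, two clusters.
U = [[0.9, 0.1], [0.1, 0.9]]
sq_dists = [[1.0, 9.0], [9.0, 1.0]]
j_curr = objective(U, sq_dists)
print(round(j_curr, 6))                        # 1.8
print(has_converged(2.5, j_curr, eps=1e-3))    # False
print(has_converged(1.8005, j_curr, eps=1e-3)) # True
```

The test is one-sided, as written in the claim: an increase in J between iterations also satisfies it and ends the loop.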
8. An automobile data clustering system based on intuitionistic fuzzy C-means, characterized by comprising:
An input module configured to: input the set parameters and the automobile data set characteristic value matrix;
A weighting module configured to: perform a weighting calculation on the automobile data set characteristic value matrix by using the improved intuitionistic fuzzy entropy to obtain a weighted characteristic value matrix;
An initial cluster center determination module configured to: define a density parameter and determine the initial cluster centers;
A first determination module configured to: judge whether the number of iterations is smaller than a set iteration threshold; if so, enter the membership matrix calculation module; otherwise, enter the output module;
A membership matrix calculation module configured to: calculate the membership matrix;
An update module configured to: update the cluster centers by using the membership matrix;
A second determination module configured to: judge whether the difference between the sum of squared Euclidean distances of the data set relative to the cluster centers at the previous moment and that at the current moment is smaller than a set threshold; if so, enter the output module; if not, return to the membership matrix calculation module;
An output module configured to: output the membership matrix and the cluster centers, and obtain the automobile data clustering result according to the maximum membership principle.
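The control flow between the two determination modules can be sketched as a loop over injected callables. This is an illustrative skeleton only: the parameter names are hypothetical, each callable stands in for one module, and the sketch rechecks the iteration limit on every pass, which is an assumption about how the two judgments combine.

```python
def run_clustering(init_centers, compute_membership, update_centers,
                   objective, max_iter, eps):
    """Iterate membership and center updates until either test says stop."""
    V = init_centers()
    U = None
    j_prev = None
    t = 0
    while t < max_iter:                      # first determination module
        U = compute_membership(V)            # membership matrix calculation
        V = update_centers(U)                # update module
        j_curr = objective(U, V)
        t += 1
        # Second determination module: stop once J has dropped by less than eps.
        if j_prev is not None and j_prev - j_curr < eps:
            break
        j_prev = j_curr
    return U, V, t                           # handed to the output module


# Toy stand-ins so the skeleton runs; real modules would replace them.
values = iter([10.0, 5.0, 4.9999])
U, V, t = run_clustering(
    init_centers=lambda: 0,
    compute_membership=lambda V: V + 1,
    update_centers=lambda U: U,
    objective=lambda U, V: next(values),
    max_iter=10,
    eps=1e-3,
)
print(t)  # 3
```

Passing the modules in as callables keeps the loop independent of how the weighting, distance, and center formulas are implemented.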
9. An electronic device comprising a memory and a processor and computer instructions stored on the memory and executable on the processor, the computer instructions when executed by the processor performing the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of any one of claims 1 to 7.
CN201910865982.3A 2019-09-12 2019-09-12 Automobile data clustering method and system based on intuitive fuzzy C-means Active CN110569915B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910865982.3A CN110569915B (en) 2019-09-12 2019-09-12 Automobile data clustering method and system based on intuitive fuzzy C-means

Publications (2)

Publication Number Publication Date
CN110569915A true CN110569915A (en) 2019-12-13
CN110569915B CN110569915B (en) 2022-04-01

Family

ID=68779773

Country Status (1)

Country Link
CN (1) CN110569915B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2128818A1 (en) * 2007-01-25 2009-12-02 Shanghai Yaowei Industry Co, Ltd. Method of moving target tracking and number accounting
CN109145921A (en) * 2018-08-29 2019-01-04 江南大学 A kind of image partition method based on improved intuitionistic fuzzy C mean cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Jing et al.: "Improved fuzzy C-means clustering algorithm based on intuitionistic fuzzy sets", Journal of Shanghai University *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324676A (en) * 2020-02-18 2020-06-23 南京华剑兵科智能装备有限公司 Mechanical equipment lubricating oil on-line monitoring system based on fuzzy C-means clustering algorithm
CN113688926A (en) * 2021-08-31 2021-11-23 济南大学 Website behavior classification method, system, storage medium and equipment
CN113688926B (en) * 2021-08-31 2024-03-08 济南大学 Website behavior classification method, system, storage medium and equipment
CN115034611A (en) * 2022-06-09 2022-09-09 武汉理工大学 Method for evaluating comfort of large-scale marine tourism floating type complex

Also Published As

Publication number Publication date
CN110569915B (en) 2022-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant