CN110610200B - Vehicle and merchant classification method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN110610200B
Authority
CN
China
Prior art keywords
attribute
target
dimension
attribute dimension
target vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910796773.8A
Other languages
Chinese (zh)
Other versions
CN110610200A (en)
Inventor
杨笑锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dasou Vehicle Software Technology Co Ltd
Original Assignee
Zhejiang Dasou Vehicle Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dasou Vehicle Software Technology Co Ltd
Priority to CN201910796773.8A
Publication of CN110610200A
Application granted
Publication of CN110610200B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a car dealer classification method, a car dealer classification device, computer equipment and a storage medium, and relates to the technical field of car dealer classification. The car dealer classification method comprises the following steps: acquiring target feature data sets of a plurality of target car dealers, wherein the target feature data set of each target car dealer comprises feature data of that target car dealer corresponding to each of n attribute dimensions, and n is a positive integer; acquiring a clustering weight of each of the n attribute dimensions according to a relative importance coefficient between every two of the n attribute dimensions; and classifying the target car dealers with a target clustering algorithm based on the clustering weight of each attribute dimension and the feature data in the target feature data sets of the target car dealers. Because the classification is computed from objective data, the subjective human judgment relied on in the prior art is removed, and the classification accuracy of the car dealers is therefore higher.

Description

Vehicle and merchant classification method and device, computer equipment and storage medium
Technical Field
The present application relates to the technical field of vehicle-merchant classification, and in particular, to a vehicle-merchant classification method, apparatus, computer device, and storage medium.
Background
With the growth of new retail business, the number of two-network car dealers relying on the internet has also increased rapidly, where a two-network car dealer is a car dealer who trades cars with customers through a network transaction platform. Faced with a large number of two-network car dealers, the network transaction platform in practice needs to classify them, so that different operation strategies can be formulated for different types of two-network car dealers.
In the related art, an operator of the network transaction platform obtains feature data of the two-network car dealers in different attribute dimensions, extracts from it the feature data of the attribute dimensions that the operator personally considers worth attention based on industry experience, and classifies the two-network car dealers according to the extracted feature data.
Because this classification depends on the personal industry experience of operators, the accuracy of the classification of the two-network car dealers is low.
Disclosure of Invention
In view of the above, it is necessary to provide a car dealer classification method, apparatus, computer device, and storage medium to solve the problem of low accuracy in classifying two-network car dealers.
In a first aspect, an embodiment of the present application provides a vehicle-dealer classification method, where the method includes:
acquiring target characteristic data sets of a plurality of target vehicle merchants, wherein the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to the n attribute dimensions respectively;
acquiring a clustering weight of each attribute dimension in the n attribute dimensions according to a relative importance degree coefficient between any two attribute dimensions in the n attribute dimensions;
and classifying the plurality of target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the plurality of target vehicle merchants.
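As a rough illustration of the three steps above, the sketch below runs a small K-means loop in which the squared difference in each attribute dimension is scaled by that dimension's clustering weight. This passage does not fix the target clustering algorithm; K-means, the weighted Euclidean distance, the function names, the toy data and the caller-supplied starting centroids are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_sq_dist(a: Sequence[float], b: Sequence[float], w: Sequence[float]) -> float:
    """Squared Euclidean distance with one weight per attribute dimension."""
    return sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w))


def weighted_kmeans(points: List[List[float]], weights: Sequence[float],
                    centroids: List[List[float]], max_iter: int = 100) -> List[int]:
    """Return one cluster label per point, using the weighted distance.

    `centroids` are starting points chosen by the caller (an assumption of
    this sketch, made so that the result is deterministic).
    """
    centroids = [list(c) for c in centroids]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: weighted_sq_dist(p, centroids[k], weights))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical normalized feature data: four dealers, two attribute dimensions.
dealers = [[0.9, 0.1], [1.0, 0.9], [0.1, 0.2], [0.0, 0.8]]
weights = [0.9, 0.1]  # hypothetical clustering weights
labels = weighted_kmeans(dealers, weights, centroids=[dealers[0], dealers[2]])
print(labels)
```

With the weights reversed to `[0.1, 0.9]` the same data groups differently, which is the sense in which each attribute dimension influences the result through its clustering weight.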
In an embodiment of the present application, obtaining a clustering weight of each attribute dimension of the n attribute dimensions according to a relative importance coefficient between every two attribute dimensions of the n attribute dimensions includes:
obtaining a relative importance degree coefficient between each attribute dimension of the n attribute dimensions and a target attribute dimension to obtain n relative importance degree coefficients, wherein the target attribute dimension is any one of the n attribute dimensions;
and multiplying the n relative importance degree coefficients, and calculating the clustering weight of the target attribute dimension according to the product obtained by multiplication.
In an embodiment of the present application, obtaining a clustering weight of each attribute dimension of the n attribute dimensions according to a relative importance coefficient between any two attribute dimensions of the n attribute dimensions includes:
determining a plurality of first-order attribute dimensions according to the n attribute dimensions, wherein each first-order attribute dimension comprises a plurality of second-order attribute dimensions, and each second-order attribute dimension is one of the n attribute dimensions;
acquiring a clustering weight of each of the plurality of first-order attribute dimensions according to a relative importance coefficient between any two of the plurality of first-order attribute dimensions;
for each first-order attribute dimension, acquiring the plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each of those second-order attribute dimensions within the first-order attribute dimension according to a relative importance coefficient between any two of them;
and, for each first-order attribute dimension, calculating the clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of the first-order attribute dimension and the first clustering weight of the second-order attribute dimension.
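The two-level weighting above can be sketched as a simple product of weights. The grouping into first-order dimensions, the dimension names and every number below are hypothetical; only the rule "first-order weight times first clustering weight" is taken from the text.

```python
def second_order_weights(first_order, local):
    """Multiply each first-order weight by the local weights of its children.

    first_order: {first-order dimension: clustering weight}
    local: {first-order dimension: {second-order dimension: first clustering weight}}
    """
    result = {}
    for group, group_weight in first_order.items():
        for dim, local_weight in local[group].items():
            result[dim] = group_weight * local_weight
    return result


# Hypothetical grouping and weights, for illustration only.
first_order = {"activity": 0.6, "sales": 0.4}
local = {
    "activity": {"logins": 0.5, "shares": 0.5},
    "sales": {"leads_created": 0.25, "orders_delivered": 0.75},
}
final = second_order_weights(first_order, local)
print(final)
```

If the first-order weights sum to 1 and the local weights sum to 1 within each group, the resulting second-order weights also sum to 1.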
In one embodiment of the present application, obtaining a target feature data set of a plurality of target vehicle dealers includes:
acquiring initial feature data sets of a plurality of target vehicle merchants, wherein the initial feature data set of the i-th target vehicle merchant comprises feature data of the i-th target vehicle merchant corresponding to each of k_i attribute dimensions, i is a positive integer, and k_i is a positive integer;
and carrying out normalization processing on the characteristic data in the initial characteristic data sets of the plurality of target vehicle merchants to obtain target characteristic data sets of the plurality of target vehicle merchants.
In an embodiment of the present application, normalizing the feature data in the initial feature data sets of the plurality of target vehicle merchants includes:
acquiring an attribute dimension set, wherein the attribute dimension set comprises all attribute dimensions corresponding to initial characteristic data sets of a plurality of target vehicle merchants;
for each attribute dimension in the attribute dimension set, acquiring the number of target sets corresponding to the attribute dimension, wherein the target sets are initial feature data sets containing feature data corresponding to the attribute dimension;
and according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set, performing normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle merchants.
In an embodiment of the present application, according to the number of target sets corresponding to each attribute dimension in the attribute dimension set, normalizing the feature data in the initial feature data sets of a plurality of target vehicle dealers includes:
for each attribute dimension in the attribute dimension set, when the ratio of the number of target sets corresponding to the attribute dimension to the number of initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold, acquiring the median of the feature data corresponding to the attribute dimension contained in the target sets;
and using the median as the feature data of that attribute dimension in every initial feature data set of the plurality of target vehicle dealers other than the target sets.
In an embodiment of the present application, according to the number of target sets corresponding to each attribute dimension in the attribute dimension set, normalizing the feature data in the initial feature data sets of a plurality of target vehicle dealers includes:
and for each attribute dimension in the attribute dimension set, deleting the feature data corresponding to the attribute dimension in the initial feature data sets of the target vehicle dealers when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the target vehicle dealers is smaller than a missing threshold.
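The two branches above (fill with the median when enough dealers report a dimension, delete the dimension otherwise) can be sketched as follows. The dictionary layout, the dimension names, the sample values and the threshold of 0.5 are assumptions for illustration; the text does not give a threshold value.

```python
from statistics import median


def align_dimensions(initial_sets, missing_threshold):
    """Give every dealer the same attribute dimensions.

    initial_sets: one {dimension: value} dict per dealer.
    A dimension reported by at least `missing_threshold` of the dealers is
    kept, and dealers lacking it receive the median of the reported values;
    a dimension reported by fewer dealers is dropped for everyone.
    """
    all_dims = sorted({dim for s in initial_sets for dim in s})
    total = len(initial_sets)
    aligned = [dict() for _ in initial_sets]
    for dim in all_dims:
        present = [s[dim] for s in initial_sets if dim in s]
        if len(present) / total >= missing_threshold:
            fill = median(present)
            for src, dst in zip(initial_sets, aligned):
                dst[dim] = src.get(dim, fill)
        # otherwise the dimension is simply not copied into `aligned`
    return aligned


# Hypothetical initial feature data sets for four dealers.
sets = [
    {"leads": 10, "shares": 4},
    {"leads": 20, "boss_logins": 8},
    {"leads": 5, "shares": 6},
    {"leads": 3, "shares": 1},
]
aligned = align_dimensions(sets, missing_threshold=0.5)
print(aligned[1])
```

Here `shares` is reported by three of four dealers, so the second dealer receives the median of 4, 6 and 1, while `boss_logins` is reported by only one dealer and is removed.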
In a second aspect, an embodiment of the present application provides a vehicle and merchant classification device, where the device includes:
the target characteristic acquisition module is used for acquiring target characteristic data sets of a plurality of target vehicle merchants, and the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to the n attribute dimensions respectively;
the clustering weight obtaining module is used for obtaining the clustering weight of each attribute dimension in the n attribute dimensions according to the relative importance degree coefficient between any two attribute dimensions in the n attribute dimensions;
and the classification module is used for classifying the plurality of target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the plurality of target vehicle merchants.
In a third aspect, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the steps of the method of the first aspect.
In a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method of the first aspect described above.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
the car dealer classification method, device, computer equipment and storage medium can improve the accuracy of car dealer classification. A background server of the network transaction platform acquires target feature data sets of a plurality of target vehicle merchants, determines a relative importance coefficient between every two of the n attribute dimensions included in the target feature data set of each target vehicle merchant, acquires a clustering weight of each of the n attribute dimensions, and classifies the plurality of target vehicle merchants with a target clustering algorithm based on the clustering weight of each attribute dimension and the feature data in the target feature data sets. Because the background server calculates a clustering weight for every one of the n attribute dimensions in the target feature data set, all n attribute dimensions are taken into account and the importance of each attribute dimension is determined. The background server then classifies the target vehicle merchants by combining the feature data in their target feature data sets with the clustering weight of the attribute dimension corresponding to each item of feature data, so that each attribute dimension influences the classification result according to the size of its clustering weight.
Drawings
Fig. 1 is a schematic diagram of an implementation environment of a vehicle-merchant classification method according to an embodiment of the present application;
Fig. 2 is a flowchart of a vehicle-dealer classification method according to an embodiment of the present disclosure;
Fig. 3 is a flowchart of another vehicle-dealer classification method according to an embodiment of the present disclosure;
Fig. 4 is a graph illustrating a comparison of accuracy provided by embodiments of the present application;
Fig. 5 is a flowchart of another method for calculating a cluster weight of each attribute dimension according to the embodiment of the present application;
Fig. 6 is a flowchart of another method for classifying vehicle dealers according to the embodiment of the present application;
Fig. 7 is a block diagram of a vehicle-dealer classification device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
With the growth of new retail business, there are more and more two-network car dealers. To make them easier to manage, the network transaction platform needs to classify the two-network car dealers, so that different operation strategies can be formulated for different types of two-network car dealers. For example, core two-network car dealers enjoy benefits such as priority delivery, fast payment and preferential allocation of car-purchase leads.
Currently, the classification method for two-network car dealers needs to depend on the working experience of operators of the network transaction platform. Specifically, the operator may obtain feature data corresponding to a plurality of attribute dimensions of the two-network car dealer, then extract, from the plurality of attribute dimensions, feature data corresponding to attribute dimensions that the operator considers to be important by himself/herself, and determine a category to which the two-network car dealer belongs based on the extracted feature data.
When different operators classify the same two-network car dealer, they extract different feature data because they consider different attribute dimensions important, so the classification results they obtain may also differ. Classifying the two-network car dealers according to the personal industry experience of operators therefore yields classification results of low accuracy.
The embodiment of the application provides a classification method for the car dealers, which can improve the classification accuracy of the car dealers. In the method, a background server calculates a clustering weight for each attribute dimension in n attribute dimensions included in a target feature data set, the n attribute dimensions can be considered comprehensively, the importance degree of each attribute dimension can be determined, and further, the background server combines the feature data in the target feature data set of a target vehicle dealer and the clustering weight of the attribute dimension corresponding to each feature data to classify the target vehicle dealer.
In the following, a brief description will be given of an implementation environment related to the vehicle dealer classification method provided in the embodiment of the present application.
Referring to Fig. 1, the implementation environment may include a background server of a network transaction platform, and the internal structure of the background server may be as shown in Fig. 1. The background server comprises a processor, a memory, a network interface and a database connected through a system bus. The processor of the background server provides computing and control capabilities. The memory of the background server comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the background server stores the feature data included in the target feature data sets of a plurality of target vehicle merchants. The network interface of the background server is used to connect to and communicate with an external terminal through a network. The computer program, when executed by the processor, implements a car dealer classification method.
It will be understood by those skilled in the art that the structure shown in fig. 1 is a block diagram of only a part of the structure related to the present application, and does not constitute a limitation to the backend server to which the present application is applied, and a specific backend server may include more or less components than those shown in the drawings, or combine some components, or have a different arrangement of components.
Please refer to fig. 2, which shows a flowchart of a vehicle-dealer classifying method according to an embodiment of the present application, where the vehicle-dealer classifying method can be applied to the background server shown in fig. 1. As shown in fig. 2, the method for classifying a vehicle dealer may include the steps of:
Step 201: the background server acquires target feature data sets of a plurality of target vehicle merchants.
The target feature data set of each target vehicle dealer comprises feature data of each target vehicle dealer, wherein the feature data correspond to n attribute dimensions respectively, and n is a positive integer.
In one possible implementation, as shown in fig. 3, the process of the background server obtaining the target feature data sets of the plurality of target vehicle dealers may include:
Step 301: the background server acquires initial feature data sets of a plurality of target vehicle merchants.
The initial feature data set of the i-th target vehicle merchant comprises feature data of the i-th target vehicle merchant corresponding to each of k_i attribute dimensions, where i is a positive integer and k_i is a positive integer.
The target vehicle merchant can upload its initial feature data, together with the attribute dimension corresponding to each item of initial feature data, to the network transaction platform through a car dealer app (application) provided by the network transaction platform. The initial feature data of different target vehicle merchants, and the attribute dimensions corresponding to that data, may differ.
The embodiment of the application exemplarily provides initial feature data sets of five target vehicle merchants, wherein the initial feature data set of the target vehicle merchant 1 may be as shown in table 1, and the feature data corresponding to the target vehicle merchant 1 and the attribute dimension corresponding to the feature data are shown in table 1.
TABLE 1
Figure BDA0002181175250000081
The initial feature data set of the target vehicle dealer 2 may be as shown in table 2, where the feature data corresponding to the target vehicle dealer 2 and the attribute dimension corresponding to the feature data are shown in table 2.
TABLE 2
Figure BDA0002181175250000082
The initial feature data set of the target vehicle dealer 3 may be as shown in table 3, and the feature data corresponding to the target vehicle dealer 3 and the attribute dimension corresponding to the feature data are shown in table 3.
TABLE 3
Figure BDA0002181175250000091
The initial feature data set of the target vehicle dealer 4 may be as shown in table 4, where the feature data corresponding to the target vehicle dealer 4 and the attribute dimension corresponding to the feature data are shown in table 4.
TABLE 4
Figure BDA0002181175250000092
The initial feature data set of the target vehicle dealer 5 may be as shown in table 5, and the feature data corresponding to the target vehicle dealer 5 and the attribute dimension corresponding to the feature data are shown in table 5.
TABLE 5
Figure BDA0002181175250000093
It can be seen that the attribute dimensions corresponding to the feature data in the initial feature data set of target vehicle dealer 1 (hereinafter referred to as the attribute dimensions of the target vehicle dealer) include created leads, number of completed car-delivery orders, number of shares, and followed-up leads. The attribute dimensions of target vehicle dealer 2 include created leads, number of completed car-delivery orders, cumulative number of employees logged in, cumulative number of boss logins, and number of follow-up customers. The attribute dimensions of target vehicle dealer 3 include created leads, number of completed car-delivery orders, cumulative number of employees logged in, number of shares, and number of follow-up customers. The attribute dimensions of target vehicle dealer 4 include created leads, number of customers entered, cumulative number of employees logged in, number of shares, QR code promotion, and cumulative number of logins. The attribute dimensions of target vehicle dealer 5 include created leads, cumulative number of logins, cumulative number of employees logged in, number of shares, and QR code promotion.
It should be noted that, in the embodiment of the present application, only the attribute dimensions corresponding to the feature data included in the initial feature data sets of the target vehicle dealers 1 to 5 are exemplarily given, and the attribute dimensions corresponding to the feature data in the actual initial feature set of the target vehicle dealers are not limited to the attribute dimensions shown in tables 1 to 5. The attribute dimension corresponding to the feature data in the initial feature set of the actual target vehicle dealer can be determined by combining the actual situation of the target vehicle dealer, and the attribute dimension included in the initial feature set of the target vehicle dealer is not limited in the embodiment of the application.
Step 302, the background server normalizes the feature data in the initial feature data sets of the multiple target vehicle merchants to obtain the target feature data sets of the multiple target vehicle merchants.
Because the number of feature data items in the initial feature data set of each target vehicle dealer, and the attribute dimensions corresponding to that feature data, may differ, the initial feature data set of each target vehicle dealer needs to be processed so that the attribute dimensions corresponding to the feature data of every target vehicle dealer are the same. Only then can the plurality of target vehicle dealers be classified uniformly.
The embodiment of the application exemplarily provides a method for performing normalization processing on feature data in an initial feature data set of a plurality of target vehicle dealers by a background server, and the method may include the following steps:
A1. Combine Tables 1 to 5 into one data set, as shown in Table 6, where a horizontal line in Table 6 indicates that there is no data.
TABLE 6
Figure BDA0002181175250000101
A2. For each of the attribute dimensions in Table 6, obtain all feature data corresponding to the attribute dimension and normalize it.
The embodiment of the application exemplarily provides two ways to perform normalization processing on all feature data corresponding to the attribute dimension.
The first way: take the difference between the maximum value and the minimum value of the feature data corresponding to the attribute dimension as the denominator and each item of feature data as the numerator; the resulting fraction is the normalized feature data of that attribute dimension for the corresponding target vehicle dealer.
The second way: establish a normalization formula (given as an image in the original publication):
Figure BDA0002181175250000102
where d_max is the maximum value of the feature data corresponding to the attribute dimension, d_min is the minimum value of the feature data corresponding to the attribute dimension, d_i is the i-th item of feature data corresponding to the attribute dimension, and r_i is the normalized value of the i-th item of feature data corresponding to the attribute dimension.
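The formula for the second way survives here only as an image reference. Given the four symbols defined for it, one plausible reading, stated as an assumption rather than as the text of the patent, is standard min-max scaling:

```latex
r_i = \frac{d_i - d_{\min}}{d_{\max} - d_{\min}}
```

Under this reading every r_i lies in [0, 1], whereas the first way divides d_i itself by the range and can therefore exceed 1.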
In the embodiment of the present application, normalization in the first way is described below, taking the created-leads attribute dimension as an example:
the maximum value of the initial feature data corresponding to the created-leads attribute dimension is 20 and the minimum value is 3, so 20 - 3 = 17 is used as the denominator and each item of feature data is used as the numerator to obtain a fraction. The normalization result for target vehicle dealer 1 is 10/17 ≈ 0.59, for target vehicle dealer 2 it is 20/17 ≈ 1.18, for target vehicle dealer 3 it is 5/17 ≈ 0.29, for target vehicle dealer 4 it is 3/17 ≈ 0.18, and for target vehicle dealer 5 it is 6/17 ≈ 0.35, as shown in Table 7.
TABLE 7
Target vehicle dealer      Created leads
Target vehicle dealer 1    0.59
Target vehicle dealer 2    1.18
Target vehicle dealer 3    0.29
Target vehicle dealer 4    0.18
Target vehicle dealer 5    0.35
The other attribute dimensions are normalized in the same way as the created-leads attribute dimension, so the details are not repeated. Note that the empty entries in Table 6 do not take part in the normalization calculation, and their normalized result may simply be set to 0 by default.
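The first way can be checked with a few lines of code. The input values are the created-leads figures from the worked example; the function name, the use of `None` for an empty entry, and the error raised for a zero range are assumptions of this sketch.

```python
def normalize_by_range(values):
    """First way: divide each value by (max - min) of the reported values.

    Missing entries (None) are excluded from max/min and mapped to 0.
    """
    present = [v for v in values if v is not None]
    span = max(present) - min(present)
    if span == 0:
        # Not covered by the text; this sketch refuses to divide by zero.
        raise ValueError("all reported values are equal, so the range is zero")
    return [0.0 if v is None else v / span for v in values]


created_leads = [10, 20, 5, 3, 6]  # target vehicle dealers 1 to 5
normalized = [round(r, 2) for r in normalize_by_range(created_leads)]
print(normalized)  # [0.59, 1.18, 0.29, 0.18, 0.35]
```

The printed values match Table 7, including the 1.18 for target vehicle dealer 2, which shows that this way of normalizing does not confine results to the interval from 0 to 1.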
After normalization, Table 8 is obtained; it shows the target feature data sets of target vehicle dealers 1 to 5.
The target feature data set of target vehicle dealer 1 is {0.59, 1.67, 0.8, 0, 0, 0, 0, 0, 0, 0}; that of target vehicle dealer 2 is {1.18, 0.67, 0, 0, 1.6, 0, 2, 0, 0, 0}; that of target vehicle dealer 3 is {0.29, 0.67, 1.2, 0, 0.6, 0, 1, 0, 0, 0}; that of target vehicle dealer 4 is {0.18, 0, 0.8, 0, 0, 0, 0, 0, 7.5, 2.5, 0}; and that of target vehicle dealer 5 is {0.35, 0, 0.2, 0, 0.8, 0, 0, 6.5, 1.5, 0.5}.
TABLE 8
Figure BDA0002181175250000121
The attribute dimensions corresponding to the feature data in the target feature data sets of the plurality of target vehicle dealers are the same.
Step 202, the background server obtains a clustering weight of each attribute dimension of the n attribute dimensions according to a relative importance degree coefficient between any two attribute dimensions of the n attribute dimensions.
As described above, the attribute dimensions corresponding to the feature data in the target feature data sets of the plurality of target vehicle dealers are the same, and n attribute dimensions in the target feature data set may be obtained from the target feature data set of any one target vehicle dealer.
The relative importance coefficient between any two of the n attribute dimensions represents the relative importance between any two attribute dimensions.
Optionally, the Analytic Hierarchy Process (AHP) may be used to determine the clustering weight of each attribute dimension. The specific procedure may include the following steps:
Step B1: the background server obtains the relative importance coefficient between each of the n attribute dimensions and the target attribute dimension, so that each attribute dimension obtains n relative importance coefficients.
The target attribute dimension is any one of n attribute dimensions. The relative importance degree coefficient between each attribute dimension and the target attribute dimension may be set by a developer according to an empirical value, and then adjusted, as shown in table 9, where n is 5 in table 9, and only 5 attribute dimensions are exemplarily selected, and correspondingly, each attribute dimension may obtain 5 relative importance degree coefficients.
TABLE 9
Figure BDA0002181175250000131
Step B2: the background server multiplies together the n relative importance coefficients corresponding to each attribute dimension and calculates the clustering weight of each attribute dimension from the resulting product.
The clustering weight WI (WI for short, from "weight") of each attribute dimension is calculated as WI = (n-th root) / sum(n-th roots), where the "n-th root" of an attribute dimension is the n-th root of the product of its n relative importance coefficients and the sum runs over all attribute dimensions, as shown in Table 10:
TABLE 10
| Attribute dimension | Multiplication by rows | n-th root | Clustering weight WI | AWI | AWI/WI |
| --- | --- | --- | --- | --- | --- |
| Created clues | 2.25 | 1.12 | 0.1847 | 1.13 | 6.12 |
| Completed vehicle-delivery orders | 2.25 | 1.12 | 0.1847 | 1.13 | 6.12 |
| Number of shares | 2.25 | 1.12 | 0.1847 | 1.13 | 6.12 |
| Followed-up clues | 2.25 | 1.12 | 0.1847 | 1.13 | 6.12 |
| Cumulative number of persons logged in | 0.20 | 0.79 | 0.1305 | 0.75 | 5.77 |
| Two-dimensional code sharing | 0.20 | 0.79 | 0.1305 | 0.75 | 5.77 |
Taking the created-clues attribute dimension in Table 9 as an example, the "multiplication by rows" operation is 1.0 × 1.5 × 1.5 = 2.25, and the n-th root operation (taking n = 7 as an example; the value of n can be adjusted flexibly according to actual conditions) is as follows:
$$\sqrt[7]{2.25} \approx 1.12$$
Similarly, the other attribute dimensions are calculated in the same way. The clustering weight WI of the created-clues attribute dimension is then 1.12/(1.12 + 1.12 + 1.12 + 1.12 + 0.79 + 0.79) = 0.1847. The clustering weights of the other attribute dimensions, such as the number of completed vehicle-delivery orders, the number of shares, followed-up clues, and the cumulative number of employees logged in, can be obtained by the same calculation and are not described again here.
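The row-product, n-th root, and normalization steps of step B2 can be sketched in Python as follows. This is only an illustration: Table 9 is available only as an image, so the pairwise matrix below is a hypothetical one (the first four dimensions equally important, each 1.5 times as important as the last two), chosen because its row products agree with Table 10. The function name `ahp_weights` and the dimension labels are likewise assumptions.

```python
import math

# Hypothetical pairwise matrix (Table 9 is an image in the source).
# Entry [i][j] is the importance of dimension i relative to dimension j.
DIMS = ["created clues", "completed delivery orders", "shares",
        "followed-up clues", "cumulative logins", "QR-code sharing"]
MATRIX = ([[1.0] * 4 + [1.5] * 2 for _ in range(4)]
          + [[1 / 1.5] * 4 + [1.0] * 2 for _ in range(2)])


def ahp_weights(matrix, n):
    """Return (row products, n-th roots, normalized clustering weights)."""
    products = [math.prod(row) for row in matrix]
    roots = [p ** (1.0 / n) for p in products]
    total = sum(roots)
    return products, roots, [r / total for r in roots]


# n = 7 follows the example value used in the text.
products, roots, weights = ahp_weights(MATRIX, n=7)
for name, p, r, w in zip(DIMS, products, roots, weights):
    print(f"{name:26s} product={p:.2f} root={r:.2f} WI={w:.4f}")
```

Under these assumptions the printed values round to the "multiplication by rows", n-th root, and WI columns of Table 10.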
Step B3: perform a consistency check on the clustering weight of each attribute dimension.
A check coefficient CR (Consistency Ratio) is introduced to judge whether the calculated clustering weights WI pass the consistency check. If CR is less than 0.1, the calculated clustering weights WI are considered to pass the consistency check; otherwise, the relevant parameters (e.g., the relative importance coefficient between each attribute dimension and the target attribute dimension, the value of n, etc.) may be adjusted and CR calculated again.
Here, CR = CI/RI. CI (Consistency Index) = (AVERAGE(AWI/WI) - k) × (k - 1), where AVERAGE denotes the arithmetic mean and k denotes the order (i.e., the number of attribute dimensions; in this embodiment, k is 6). RI (Random Index, the random consistency index) takes the standard values shown in Table 11 (different consistency-check standards use slightly different RI values). AWI (all weight, the total ranking weight) = sum(relative importance coefficient of the attribute dimension × clustering weight).
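TABLE 11
| Order | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RI | 0 | 0 | 0.58 | 0.90 | 1.12 | 1.24 | 1.32 | 1.41 | 1.45 | 1.49 |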
Taking the created-clues attribute dimension as an example, AWI = 1.0 × 0.1847 + 1.0 × 0.1847 + 1.0 × 0.1847 + 1.0 × 0.1847 + 1.5 × 0.1305 + 1.5 × 0.1305 = 1.13. Then CR = CI/RI = 0.022373867/1.24 = 0.0169499 < 0.1; therefore, it can be determined that the clustering weight WI of each attribute dimension passes the consistency check.
Step 203: the background server classifies the plurality of target vehicle dealers using a target clustering algorithm, based on the clustering weight of each attribute dimension and the feature data in the target feature data sets of the plurality of target vehicle dealers.
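The consistency check of step B3 can be sketched as follows. The sketch assumes the same hypothetical pairwise matrix as a stand-in for Table 9 (which is only available as an image), writes CI exactly as the text states it, and looks RI up by the order k in Table 11; the function name `consistency_check` is an assumption.

```python
import math

# RI values from Table 11, indexed by order.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def consistency_check(matrix, weights):
    """Return (AWI per dimension, CI, CR) for a pairwise matrix."""
    k = len(matrix)
    # AWI: sum of (relative importance coefficient x clustering weight).
    awi = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    average = sum(a / w for a, w in zip(awi, weights)) / k
    ci = (average - k) * (k - 1)   # CI as written in the text
    return awi, ci, ci / RI[k]


# Hypothetical matrix: four equally important dimensions, each 1.5 times
# as important as the remaining two.
matrix = ([[1.0] * 4 + [1.5] * 2 for _ in range(4)]
          + [[1 / 1.5] * 4 + [1.0] * 2 for _ in range(2)])
roots = [math.prod(row) ** (1 / 7) for row in matrix]
weights = [r / sum(roots) for r in roots]

awi, ci, cr = consistency_check(matrix, weights)
print(f"AWI={awi[0]:.2f} CI={ci:.4f} passes={cr < 0.1}")
```

If CR were 0.1 or larger, the relative importance coefficients or the value of n would be adjusted and the check repeated.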
The target clustering algorithm is the K-means clustering algorithm. The K-means clustering algorithm randomly selects K objects as initial cluster centers, then calculates the distance between each object and each cluster center and assigns each object to the nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the cluster center of the cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met. The termination condition may be that no objects (or a minimum number of objects) are reassigned to different clusters, that no cluster centers (or a minimum number of cluster centers) change again, or that the sum of squared errors reaches a local minimum.
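A minimal sketch of this loop in plain Python is given below. It is a simplified variant, not the patented procedure: it uses unweighted squared Euclidean distance, recomputes the centers once per pass rather than after every single assignment, and stops when no object is reassigned. The function name `kmeans` and the sample points are assumptions for illustration.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means; returns (cluster centers, label of each point)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # K random objects
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assign every object to its nearest cluster center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:   # termination: no object was reassigned
            break
        labels = new_labels
        # Recompute each center as the mean of the objects assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:            # an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical two-dimensional sample with two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(labels)
```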
In the embodiment of the present application, the process of classifying a plurality of target car dealers by the background server using the Kmeans clustering algorithm may include the following steps:
Step C1: randomly select K target vehicle dealers from the plurality of target vehicle dealers and take them as the K cluster centers.
K represents the number of classes into which the target vehicle dealers are classified; the value of K may be chosen based on experience.
Alternatively, the value may be taken according to the actual classification requirement, for example, if the user needs to classify a plurality of target car dealers into 3 grades, K is 3.
Alternatively, the value of K can be determined by a neural network learning method. For example, 5000 target vehicle dealers are selected and classified according to the feature data in their target feature data sets together with operational experience; the category of each target vehicle dealer is recorded as its first classification result.
Then, 70% of the target feature data sets of the 5000 target vehicle dealers are randomly selected as the training data set, and the remaining 30% are used as the test data set. A K value is given at random, and the feature data of the target feature data sets in the training data set are trained with a neural network learning method so as to classify the target vehicle dealers in the training data set into K classes; a neural network classification model is formed through iterative learning. For each target vehicle dealer in the test data set, the target feature data in its target feature data set are input into the neural network classification model, and the model outputs the class of the target vehicle dealer, giving the second classification result of that target vehicle dealer.
For each target vehicle dealer in the test data set, the first classification result and the second classification result of the same target vehicle dealer are compared. When the two results are the same, the second classification result is judged to be accurate; when they differ, it is judged to be inaccurate. The classification accuracy is then counted, as shown in fig. 4, which shows the accuracy of classifying the target vehicle dealers by a common classification method and by the classification method provided by the embodiment of the present application. As can be seen from fig. 4, the classification method provided by the embodiment of the present application classifies based on the clustering weight of each attribute dimension and the feature data in the target feature data sets of the plurality of target vehicle dealers, and the accuracy of its classification results is generally higher than that of the common classification method. The classification accuracy is highest when K is 6, so K is determined to be 6.
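The bookkeeping around this K selection (the 70%/30% split, the comparison of first and second classification results, and the choice of the K with the highest accuracy) can be sketched as follows. The neural network itself is not specified in the text and is left out; the function names and the accuracy figures in the last line are hypothetical placeholders, not values read from fig. 4.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Randomly split items into a training part and a test part."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(first_results, second_results):
    """Share of dealers whose second result equals the first result."""
    same = sum(a == b for a, b in zip(first_results, second_results))
    return same / len(first_results)


def pick_k(accuracy_by_k):
    """Return the K whose classification accuracy is highest."""
    return max(accuracy_by_k, key=accuracy_by_k.get)


train, test = split_train_test(range(5000))
print(len(train), len(test))                       # 3500 1500
print(accuracy([1, 2, 3, 3], [1, 2, 3, 1]))        # 0.75
# Hypothetical accuracies for illustration only.
print(pick_k({4: 0.81, 5: 0.84, 6: 0.90, 7: 0.86}))  # 6
```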
On the basis of determining the K value, optionally, the target feature data sets respectively corresponding to the K clustering centers may be:
Dk=[(0.33,0.41,0.44,0.56,...,0.11),(0.20,0.22,0.40,0.35,...,0.52),(0.40,0.35,0.37,0.26,...,0.30),...,(0.25,0.27,0.59,0.64,...,0.33)]=[D1,D2,D3,...,Dk]。
Step C2: for each target vehicle dealer, calculate the distances from the target vehicle dealer to the K cluster centers according to the clustering weight of each attribute dimension and the feature data in the target feature data set of the target vehicle dealer.
For example, the clustering weights of the attribute dimensions are known to be W = [W1, W2, W3, ..., Wm], where Wm represents the clustering weight of the m-th attribute dimension. If the target feature data set of target vehicle dealer A is (0.34, 0.45, 0.44, 0.56, ..., 0.50), the distance d1 from target vehicle dealer A to cluster center D1 is calculated as follows:
Figure BDA0002181175250000161
Similarly, the distances d2, d3, ..., dk from target vehicle dealer A to the remaining cluster centers D2, D3, ..., Dk can be obtained by the same calculation, yielding the distance set d = [d1, d2, d3, ..., dk] between target vehicle dealer A and each cluster center.
Step C3: merge each target vehicle dealer into the cluster center corresponding to the minimum distance, according to the distances from the target vehicle dealer to the K cluster centers.
For example, if the minimum distance in the distance set d from target vehicle dealer A to each cluster center is dmin = d2, target vehicle dealer A is merged into the cluster center D2 corresponding to the minimum distance d2. Similarly, the plurality of target vehicle dealers can each be merged into different cluster centers.
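The distance calculation and the merge into the nearest cluster center can be sketched as follows. Because the distance formula is only available as an image, the sketch assumes a weighted Euclidean distance, which is one plausible reading and not necessarily the formula of the original. The vectors contain only the components that are visible in the text (the elided middle components are omitted), and the equal weights are hypothetical.

```python
import math


def weighted_distance(features, center, weights):
    """Weighted Euclidean distance (an assumed form of the distance)."""
    return math.sqrt(sum(w * (x - c) ** 2
                         for w, x, c in zip(weights, features, center)))


def nearest_center(features, centers, weights):
    """Return (index, distance) of the closest cluster center."""
    distances = [weighted_distance(features, c, weights) for c in centers]
    index = min(range(len(centers)), key=distances.__getitem__)
    return index, distances[index]


# Visible components of dealer A and of cluster centers D1, D2, D3.
dealer_a = (0.34, 0.45, 0.44, 0.56, 0.50)
centers = [(0.33, 0.41, 0.44, 0.56, 0.11),
           (0.20, 0.22, 0.40, 0.35, 0.52),
           (0.40, 0.35, 0.37, 0.26, 0.30)]
weights = [0.2] * 5   # hypothetical clustering weights

index, distance = nearest_center(dealer_a, centers, weights)
print(index, round(distance, 3))   # 1 0.154, i.e. center D2 is nearest
```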
Optionally, in this embodiment of the application, a cluster center and the objects assigned to it represent a cluster, and after a new object is added to the cluster, the target feature data set of the cluster center may be determined again according to all the objects in the cluster. The method for re-determining the target feature data set of the cluster center may be: calculate the average value or the median of the feature data of all the objects in the cluster for the same attribute dimension, and take the obtained average value or median as the feature data of the corresponding attribute dimension in the new target feature data set of the cluster center.
For example, cluster center D2 and target vehicle dealer A represent a cluster. The target feature data set of cluster center D2 is recalculated from the target feature data set (0.20, 0.22, 0.40, 0.35, ..., 0.52) corresponding to cluster center D2 and the target feature data set (0.34, 0.45, 0.44, 0.56, ..., 0.50) of target vehicle dealer A. The feature data for each attribute dimension are averaged: (0.20 + 0.34)/2 = 0.27, (0.22 + 0.45)/2 = 0.335, (0.40 + 0.44)/2 = 0.42, (0.35 + 0.56)/2 = 0.455, ..., (0.52 + 0.50)/2 = 0.51. The new target feature data set of cluster center D2 is therefore (0.27, 0.335, 0.42, 0.455, ..., 0.51).
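The re-determination of a cluster center by per-dimension mean or median can be sketched as follows. The function name `update_center` is an assumption, and the vectors again contain only the components visible in the text.

```python
from statistics import mean, median


def update_center(members, use_median=False):
    """Recompute a cluster center dimension by dimension."""
    combine = median if use_median else mean
    return [combine(column) for column in zip(*members)]


d2 = (0.20, 0.22, 0.40, 0.35, 0.52)       # visible components of D2
dealer_a = (0.34, 0.45, 0.44, 0.56, 0.50)  # visible components of dealer A

new_center = [round(v, 3) for v in update_center([d2, dealer_a])]
print(new_center)   # [0.27, 0.335, 0.42, 0.455, 0.51]
```

With only two members the mean and the median coincide; passing `use_median=True` matters once a cluster holds more objects.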
The classification method for the car dealers can improve the classification accuracy of the car dealers. The background server calculates a clustering weight for each attribute dimension of n attribute dimensions included in the target feature data set, the n attribute dimensions can be considered comprehensively, the importance degree of each attribute dimension can be determined, and further the background server combines the feature data in the target feature data set of the target vehicle dealer and the clustering weight of the attribute dimension corresponding to each feature data to classify the target vehicle dealer.
Referring to fig. 5, another method for calculating cluster weights of attribute dimensions according to an embodiment of the present application is shown, where the method for classifying vehicle dealers can be applied to the background server shown in fig. 1. In the embodiment of the present application, the attribute dimensions of the target vehicle dealer may be divided into multiple hierarchies, and a process of calculating the clustering weight for each attribute dimension may be as shown in fig. 5, which includes the following steps:
Step 501: determine a plurality of first-order attribute dimensions according to the n attribute dimensions, where each first-order attribute dimension includes a plurality of second-order attribute dimensions, and each second-order attribute dimension is any one of the n attribute dimensions.
The attribute dimensions corresponding to the feature data in the target feature data sets of the target vehicle dealers are classified into hierarchies. In the embodiment of the present application, two hierarchies are taken as an example, where the first-order attribute dimensions include clue-related numbers, customer-transaction-related numbers, sharing-related numbers, SaaS (Software as a Service) activity-related numbers, last-month clue-related numbers, and last-month SaaS activity-related numbers.
Each first-order attribute dimension includes second-order attribute dimensions. For example, the second-order attribute dimensions included in the clue-related numbers are: the number of created clues, the number of clues followed up within 2 hours, the number of clues followed up within 24 hours, and the number of followed-up clues. The second-order attribute dimensions included in the customer-transaction-related numbers are: the number of orders created, the number of completed vehicle-delivery orders, the number of customers entered, and the number of customers followed up. The second-order attribute dimensions included in the sharing-related numbers are: the total number of promotions, the number of coupons, the number of marketing activities, the number of shares, and the number of two-dimensional code promotions. The second-order attribute dimensions included in the SaaS activity-related numbers are: the cumulative number of logins, the cumulative number of boss logins, the cumulative number of employee logins, the cumulative number of persons logged in, the cumulative number of bosses logged in, the cumulative number of employees logged in, newly added accounts, the cumulative number of login days in the month, and opening the owner's age of the buyer. The second-order attribute dimensions included in the last-month clue-related numbers are: the number of clues created in the previous month, the number of clues followed up within 2 hours in the previous month, the number of clues followed up within 24 hours in the previous month, the number of customers entered in the previous month, the number of customers followed up in the previous month, the number of marketing activities in the previous month, and the number of shares in the previous month.
The second-order attribute dimensions included in the last-month SaaS activity-related numbers are: the cumulative number of logins in the previous month, the cumulative number of boss logins in the previous month, the cumulative number of employee logins in the previous month, the cumulative number of bosses logged in during the previous month, and the cumulative number of employees logged in during the previous month.
It should be noted that the first-order attribute dimension and the second-order attribute dimension shown in the present application are only exemplary, and the attribute dimension corresponding to the target vehicle dealer may include, but is not limited to, the attribute dimensions disclosed above.
Step 502, obtaining a clustering weight of each first-order attribute dimension according to a relative importance coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions.
The relative importance coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions is obtained, as shown in table 12, and the clustering weight of each first-order attribute dimension can be obtained by referring to the calculation process disclosed in step 202 according to the relative importance coefficient in table 12, as shown in table 13.
TABLE 12
Figure BDA0002181175250000191
TABLE 13
Figure BDA0002181175250000201
Step 503, for each first-order attribute dimension, obtaining a plurality of second-order attribute dimensions included in the first-order attribute dimension, and obtaining a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions.
This step is used to calculate a first clustering weight for each of a plurality of second-order attribute dimensions that belong to the same first-order attribute dimension.
Taking the second-order attribute dimensions included in the clue-related numbers as an example, the relative importance coefficients between every two of the number of created clues, the number of clues followed up within 2 hours, the number of clues followed up within 24 hours, and the number of followed-up clues are obtained, as shown in Table 14, and the first clustering weight of each of these second-order attribute dimensions is obtained, as shown in Table 15.
The method for calculating the first clustering weight of each attribute dimension of the second-order attribute dimension is similar to the method disclosed in step 202, and is not described herein again.
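TABLE 14
| | Created clues | Clues followed up within 2 hours | Clues followed up within 24 hours | Followed-up clues |
| --- | --- | --- | --- | --- |
| Created clues | 1.00 | 0.50 | 0.67 | 1.00 |
| Clues followed up within 2 hours | 2.00 | 1.00 | 1.33 | 2.00 |
| Clues followed up within 24 hours | 1.50 | 0.75 | 1.00 | 1.50 |
| Followed-up clues | 1.00 | 0.50 | 0.67 | 1.00 |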
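TABLE 15
| Attribute dimension | Multiplication by rows | n-th root | First clustering weight WI | AWI | AWI/WI |
| --- | --- | --- | --- | --- | --- |
| Created clues | 0.33 | 0.80 | 0.20 | 0.74 | 3.79 |
| Clues followed up within 2 hours | 5.33 | 1.40 | 0.34 | 1.48 | 4.36 |
| Clues followed up within 24 hours | 1.69 | 1.11 | 0.27 | 1.11 | 4.11 |
| Followed-up clues | 0.33 | 0.80 | 0.20 | 0.74 | 3.79 |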
Step 504, for each first-order attribute dimension, calculating a clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of each first-order attribute dimension and the first clustering weight of each second-order attribute dimension in the first-order attribute dimension.
As described above, taking the clue-related numbers as the first-order attribute dimension, the clustering weight of each of its second-order attribute dimensions can be as shown in Table 16.
TABLE 16
Figure BDA0002181175250000211
Similarly, the process of calculating the clustering weight of each attribute dimension included in the second-order attribute dimensions included in the other first-order attribute dimensions is similar to the above process, and is not described herein again.
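The product in step 504 can be sketched as follows. The first clustering weights are the values listed in Table 15; Table 13 is available only as an image, so the first-order weight of 0.25 used here is a hypothetical placeholder, and the function name `second_order_weights` is an assumption.

```python
def second_order_weights(first_order_weight, first_clustering_weights):
    """Multiply a first-order weight into each first clustering weight."""
    return {name: first_order_weight * w
            for name, w in first_clustering_weights.items()}


# First clustering weights of the clue-related numbers (Table 15).
clue_related = {
    "created clues": 0.20,
    "clues followed up within 2 hours": 0.34,
    "clues followed up within 24 hours": 0.27,
    "followed-up clues": 0.20,
}
CLUE_RELATED_WEIGHT = 0.25   # hypothetical first-order clustering weight

final = second_order_weights(CLUE_RELATED_WEIGHT, clue_related)
for name, w in final.items():
    print(f"{name:36s} {w:.4f}")
```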
Please refer to fig. 6, which is a flowchart illustrating another vehicle dealer classification method according to an embodiment of the present disclosure, where the vehicle dealer classification method may be applied to the background server shown in fig. 1. As shown in fig. 6, the car dealer classification method may include the steps of:
Step 601: the background server obtains an attribute dimension set.
The attribute dimension set comprises all attribute dimensions corresponding to the initial characteristic data sets of the target vehicle merchants.
In the example above, the attribute dimension set {created clues, completed vehicle-delivery orders, number of shares, followed-up clues, cumulative number of employees logged in, cumulative number of boss logins, cumulative number of logins, two-dimensional code promotion, number of registered customers} can be obtained from Table 6.
Step 602, for each attribute dimension in the attribute dimension set, the background server obtains the number of target sets corresponding to the attribute dimension, where a target set is an initial feature data set containing feature data corresponding to the attribute dimension.
As described above, in Table 6, the target sets corresponding to the attribute dimension of the number of shares are the initial feature data sets of target vehicle dealers 1, 3, 4, and 5, so the number of target sets corresponding to the attribute dimension of the number of shares is 4.
The target sets corresponding to the attribute dimension of the cumulative number of employees logged in are the initial feature data sets of target vehicle dealers 2, 3, and 5, so the number of target sets corresponding to this attribute dimension is 3.
Similarly, the number of target sets corresponding to other attribute dimensions may be calculated, which is not described herein again.
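Counting the target sets of each attribute dimension (step 602) can be sketched as follows. Table 6 is not reproduced here, so the initial feature data sets below are hypothetical; only the pattern of which dealer has which attribute dimension, and the value 30 for followed-up clues of dealer 1, follow the text.

```python
from collections import Counter


def count_target_sets(initial_sets):
    """Count, per attribute dimension, the data sets that contain it."""
    counts = Counter()
    for features in initial_sets.values():
        counts.update(features.keys())
    return counts


# Hypothetical initial feature data sets keyed by dealer number.
initial_sets = {
    1: {"shares": 10, "followed-up clues": 30},
    2: {"employees logged in": 4},
    3: {"shares": 20, "employees logged in": 2},
    4: {"shares": 20},
    5: {"shares": 30, "employees logged in": 6},
}

counts = count_target_sets(initial_sets)
print(counts["shares"], counts["employees logged in"],
      counts["followed-up clues"])   # 4 3 1
```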
Step 603, the background server normalizes the feature data in the initial feature data sets of the multiple target vehicle dealers according to the number of the target sets corresponding to each attribute dimension in the attribute dimension sets.
In this embodiment of the present application, the normalization processing on the feature data in the initial feature data sets of multiple target vehicle dealers may include two cases, which are specifically as follows:
The first case includes the following steps:
Step S1: for each attribute dimension in the attribute dimension set, when the ratio of the number of target sets corresponding to the attribute dimension to the number of initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold, the background server obtains the median of the feature data corresponding to the attribute dimension contained in the target sets.
Optionally, in this embodiment of the present application, the missing threshold may be 0.6.
As described above, taking the attribute dimension of the number of shares as an example, the number of target sets corresponding to this attribute dimension is 4. As can be seen from Table 6, the number of initial feature data sets of the plurality of target vehicle dealers is 5, so the ratio is 4/5 = 0.8, which is greater than the missing threshold. The background server therefore obtains the median, 20, of the feature data of target vehicle dealers 1, 3, 4, and 5 for the attribute dimension of the number of shares.
Step S2: for each attribute dimension in the attribute dimension set, the median is taken as the feature data of that attribute dimension for the initial feature data sets, among those of the plurality of target vehicle dealers, that are not in the target set.
Continuing the example above, the only initial feature data set outside the target set is that of target vehicle dealer 2. The median 20 is taken as the feature data of target vehicle dealer 2 for the attribute dimension of the number of shares and is written into the initial feature data set of target vehicle dealer 2.
Similarly, the computation process for the other attribute dimensions is similar to that described above.
In the second case:
For each attribute dimension in the attribute dimension set, when the ratio of the number of target sets corresponding to the attribute dimension to the number of initial feature data sets of the plurality of target vehicle dealers is smaller than the missing threshold, the feature data corresponding to the attribute dimension are deleted from the initial feature data sets of the plurality of target vehicle dealers.
Optionally, in this embodiment of the present application, the missing threshold may be 0.6.
Taking the attribute dimension of followed-up clues as an example, the target set corresponding to this attribute dimension includes only the initial feature data set of target vehicle dealer 1, so the number of target sets corresponding to this attribute dimension is 1.
As can be seen from Table 6, the number of initial feature data sets of the plurality of target vehicle dealers is 5, so the ratio is 1/5 = 0.2, which is smaller than the missing threshold. The feature data corresponding to the attribute dimension of followed-up clues are therefore deleted from the initial feature data sets of the plurality of target vehicle dealers; that is, the feature data 30 of this attribute dimension in the initial feature data set of target vehicle dealer 1 is deleted.
By using the processing methods shown in the first and second cases, the table 17 can be obtained by processing each attribute dimension in the table 6.
TABLE 17
Figure BDA0002181175250000241
The data shown in bold are the feature data filled in with the median.
Optionally, after step S2, the feature data in table 17 may be further processed by the method shown in step a2, which is not described herein again.
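The two cases above (median filling when coverage reaches the missing threshold, deletion otherwise) can be sketched as follows. Table 6 is not reproduced here, so the share counts are hypothetical values chosen to have the median 20 stated in the text; the function name `handle_missing` is an assumption.

```python
from statistics import median


def handle_missing(initial_sets, missing_threshold=0.6):
    """Median-fill well-covered dimensions and drop sparse ones."""
    total = len(initial_sets)
    dimensions = {d for feats in initial_sets.values() for d in feats}
    result = {dealer: {} for dealer in initial_sets}
    for dim in sorted(dimensions):
        present = [f[dim] for f in initial_sets.values() if dim in f]
        if len(present) / total < missing_threshold:
            continue                  # second case: delete the dimension
        fill = median(present)        # first case: fill gaps with the median
        for dealer, feats in initial_sets.items():
            result[dealer][dim] = feats.get(dim, fill)
    return result


# Hypothetical initial feature data sets keyed by dealer number.
initial_sets = {
    1: {"shares": 10, "followed-up clues": 30},
    2: {},
    3: {"shares": 20},
    4: {"shares": 20},
    5: {"shares": 30},
}

cleaned = handle_missing(initial_sets)
print(cleaned[2])   # {'shares': 20.0}
print(cleaned[1])   # {'shares': 10}
```

Here the number of shares is covered by 4 of 5 dealers (0.8, so dealer 2 receives the median), while followed-up clues are covered by 1 of 5 (0.2, so the dimension is dropped).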
Referring to fig. 7, a block diagram of a vehicle dealer classifying device provided in an embodiment of the present application is shown, where the vehicle dealer classifying device may be configured in the implementation environment shown in fig. 1. As shown in fig. 7, the vehicle-merchant classification apparatus may include a target feature obtaining module 701, a cluster weight obtaining module 702, and a classification module 703.
A target feature obtaining module 701, configured to obtain target feature data sets of multiple target vehicle dealers, where a target feature data set of each target vehicle dealer includes feature data of each target vehicle dealer corresponding to n attribute dimensions, and n is a positive integer;
a clustering weight obtaining module 702, configured to obtain a clustering weight of each attribute dimension of the n attribute dimensions according to a relative importance coefficient between every two attribute dimensions of the n attribute dimensions;
the classification module 703 is configured to classify the multiple target vehicle dealers by using a target clustering algorithm based on the clustering weight of each attribute dimension and the feature data in the target feature data set of the multiple target vehicle dealers.
In an embodiment of the present application, the clustering weight obtaining module 702 is further configured to obtain a relative importance degree coefficient between each attribute dimension of the n attribute dimensions and the target attribute dimension, to obtain n relative importance degree coefficients, where the target attribute dimension is any one of the n attribute dimensions; and multiplying the n relative importance degree coefficients, and calculating the clustering weight of the target attribute dimension according to the product obtained by multiplication.
In an embodiment of the present application, the cluster weight obtaining module 702 is further configured to determine a plurality of first-order attribute dimensions according to the n attribute dimensions, where the first-order attribute dimensions include a plurality of second-order attribute dimensions, and the second-order attribute dimensions are any one of the n attribute dimensions;
acquiring a clustering weight of each first-order attribute dimension according to a relative importance degree coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions;
aiming at each first-order attribute dimension, acquiring a plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance degree coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions;
and aiming at each first-order attribute dimension, calculating the clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of each first-order attribute dimension and the first clustering weight of each second-order attribute dimension in the first-order attribute dimension.
In an embodiment of the present application, the target feature obtaining module 701 is specifically configured to obtain initial feature data sets of a plurality of target vehicle dealers, where the initial feature data set of the i-th target vehicle dealer includes feature data of the i-th target vehicle dealer respectively corresponding to k_i attribute dimensions, i is a positive integer, and k_i is a positive integer; and to perform normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle dealers to obtain the target feature data sets of the plurality of target vehicle dealers.
In an embodiment of the present application, the target feature obtaining module 701 is further configured to obtain an attribute dimension set, where the attribute dimension set includes all attribute dimensions corresponding to initial feature data sets of multiple target vehicle dealers; for each attribute dimension in the attribute dimension set, acquiring the number of target sets corresponding to the attribute dimension, wherein the target sets are initial characteristic data sets containing characteristic data corresponding to the attribute dimension; and according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set, performing normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle merchants.
In an embodiment of the application, the target feature obtaining module 701 is further configured to, for each attribute dimension in the attribute dimension set, obtain a median of feature data corresponding to the attribute dimension included in the target set when a ratio of a number of the target set corresponding to the attribute dimension to a number of initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold; and regarding each attribute dimension in the attribute dimension set, taking the median as the feature data of the attribute dimension corresponding to other initial feature data sets except the target set in the initial feature data sets of the plurality of target vehicle dealers.
In an embodiment of the application, the target feature obtaining module 701 is further configured to, for each attribute dimension in the attribute dimension set, delete the feature data corresponding to the attribute dimension from the initial feature data sets of the plurality of target vehicle dealers when the ratio of the number of target sets corresponding to the attribute dimension to the number of initial feature data sets of the plurality of target vehicle dealers is smaller than the missing threshold.
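The missing-value handling described in the preceding embodiments can be sketched as follows. This is a minimal illustration and not the implementation of the application: the function name `fill_or_drop`, the representation of each initial feature data set as a dictionary, and the sample dimension names are all assumptions introduced here.

```python
from statistics import median


def fill_or_drop(initial_sets, missing_threshold):
    """Fill or drop attribute dimensions according to their coverage ratio.

    initial_sets: list of dicts, one per dealer, mapping an attribute
    dimension name to a numeric feature value (hypothetical representation).
    """
    total = len(initial_sets)
    if total == 0:
        return []
    # Attribute dimension set: every dimension that appears in any initial set.
    dimensions = sorted({dim for s in initial_sets for dim in s})
    fill_values = {}
    for dim in dimensions:
        # Values taken from the "target sets", i.e. the sets containing dim.
        present = [s[dim] for s in initial_sets if dim in s]
        if len(present) / total >= missing_threshold:
            # Coverage is high enough: keep the dimension and remember the
            # median so that it can stand in for missing values.
            fill_values[dim] = median(present)
        # Otherwise the dimension is deleted from every set.
    return [
        {dim: s.get(dim, fill) for dim, fill in fill_values.items()}
        for s in initial_sets
    ]


dealers = [
    {"stock": 10, "sales": 5},
    {"stock": 30},
    {"stock": 20, "sales": 7, "age": 3},
    {"stock": 40, "sales": 9},
]
cleaned = fill_or_drop(dealers, missing_threshold=0.5)
```

In this example "sales" is present in three of four sets, so the dealer lacking it receives the median of the observed values, while "age" is present in only one of four sets and is removed everywhere. After this step every dealer has the same n attribute dimensions.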
In one embodiment of the present application, the target clustering algorithm is a Kmeans clustering algorithm.
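One way the clustering weights could enter a Kmeans clustering is through a per-dimension weighted squared Euclidean distance. The sketch below assumes exactly that; the text above states only that classification is based on the clustering weights and the feature data, so the distance form, the function names, and the random initialization are assumptions made for illustration.

```python
import random


def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one weight per attribute dimension."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def weighted_kmeans(points, weights, k, iterations=100, seed=0):
    """Plain Lloyd-style K-means using the weighted distance above."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(k), key=lambda c: weighted_sq_dist(p, centroids[c], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9)]
labels, centroids = weighted_kmeans(points, weights=[0.7, 0.3], k=2)
```

Because the weights are positive and applied per dimension, the per-dimension mean remains the minimizer in the update step. A dimension with a larger weight contributes more to the distance and therefore has more influence on which dealers are grouped together.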
In one embodiment of the present application, there is provided a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring target characteristic data sets of a plurality of target vehicle merchants, wherein the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to n attribute dimensions respectively, and n is a positive integer; acquiring a clustering weight of each attribute dimension in the n attribute dimensions according to a relative importance degree coefficient between every two attribute dimensions in the n attribute dimensions; and classifying the plurality of target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the plurality of target vehicle merchants.
In one embodiment of the application, the processor when executing the computer program may further implement the steps of: obtaining a relative importance degree coefficient between each attribute dimension of the n attribute dimensions and a target attribute dimension to obtain n relative importance degree coefficients, wherein the target attribute dimension is any one of the n attribute dimensions; and multiplying the n relative importance degree coefficients, and calculating the clustering weight of the target attribute dimension according to the product obtained by multiplication.
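The weight calculation above can be illustrated with a short sketch. The text states only that the clustering weight is calculated "according to the product" of the n relative importance degree coefficients; the sketch assumes the common geometric-mean approach (the n-th root of the product, normalized so that the weights sum to 1), and it assumes that `coeff[i][j]` expresses the importance of dimension i relative to dimension j. Both choices, and the function name, are illustrative assumptions.

```python
import math


def clustering_weights(coeff):
    """Derive one weight per attribute dimension from pairwise coefficients.

    coeff[i][j] is the assumed relative importance of dimension i versus
    dimension j, with coeff[i][i] == 1.
    """
    n = len(coeff)
    # Multiply the n coefficients of each dimension, then take the n-th root.
    roots = [math.prod(row) ** (1.0 / n) for row in coeff]
    total = sum(roots)
    # Normalize so that the weights sum to 1 (an assumption of this sketch).
    return [r / total for r in roots]


# Dimension 0 is twice as important as dimension 1 and four times as
# important as dimension 2 (made-up coefficients for illustration only).
coeff = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = clustering_weights(coeff)
```

For this matrix the row products are 8, 1 and 0.125, so the cube roots stand in the ratio 4 : 2 : 1 and the normalized weights are approximately 0.571, 0.286 and 0.143.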
In one embodiment of the application, the processor when executing the computer program may further implement the steps of: determining a plurality of first-order attribute dimensions according to the n attribute dimensions, wherein the first-order attribute dimensions comprise a plurality of second-order attribute dimensions, and the second-order attribute dimensions are any one of the n attribute dimensions; acquiring a clustering weight of each first-order attribute dimension according to a relative importance degree coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions; aiming at each first-order attribute dimension, acquiring a plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance degree coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions; and aiming at each first-order attribute dimension, calculating the clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of each first-order attribute dimension and the first clustering weight of each second-order attribute dimension in the first-order attribute dimension.
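The two-level combination described above reduces to a product of two weights per second-order attribute dimension. A minimal sketch follows; the group names, dimension names, and numeric weights are hypothetical, and the function name `second_order_weights` is introduced here only for illustration.

```python
def second_order_weights(first_order_weights, within_group_weights):
    """Combine the weights of a two-level attribute hierarchy.

    first_order_weights: {first-order dimension: clustering weight}
    within_group_weights: {first-order dimension:
                           {second-order dimension: first clustering weight}}
    """
    final = {}
    for group, group_weight in first_order_weights.items():
        for dim, local_weight in within_group_weights[group].items():
            # Final clustering weight = weight of the first-order dimension
            # times the first clustering weight inside that dimension.
            final[dim] = group_weight * local_weight
    return final


first_order = {"scale": 0.6, "activity": 0.4}
within = {
    "scale": {"stock": 0.5, "staff": 0.5},
    "activity": {"logins": 0.25, "listings": 0.75},
}
final_weights = second_order_weights(first_order, within)
```

If the first-order weights sum to 1 and the first clustering weights inside each first-order dimension also sum to 1, the resulting second-order weights sum to 1 as well, so they can be used directly as per-dimension clustering weights.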
In one embodiment of the application, the processor when executing the computer program may further implement the steps of:
acquiring initial feature data sets of a plurality of target vehicle dealers, wherein the initial feature data set of the i-th target vehicle dealer comprises feature data of the i-th target vehicle dealer corresponding respectively to k_i attribute dimensions, i being a positive integer and k_i being a positive integer; and carrying out normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle dealers to obtain the target feature data sets of the plurality of target vehicle dealers.
In one embodiment of the application, the processor when executing the computer program may further implement the steps of: acquiring an attribute dimension set, wherein the attribute dimension set comprises all attribute dimensions corresponding to initial characteristic data sets of a plurality of target vehicle merchants; for each attribute dimension in the attribute dimension set, acquiring the number of target sets corresponding to the attribute dimension, wherein the target sets are initial feature data sets containing feature data corresponding to the attribute dimension; and according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set, performing normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle merchants.
In one embodiment of the application, the processor when executing the computer program may further implement the steps of: for each attribute dimension in the attribute dimension set, when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold value, acquiring the median of the feature data corresponding to the attribute dimension contained in the target sets; and regarding each attribute dimension in the attribute dimension set, taking the median as the feature data of the attribute dimension corresponding to other initial feature data sets except the target set in the initial feature data sets of the plurality of target vehicle dealers.
In one embodiment of the application, the processor when executing the computer program may further implement the steps of: and for each attribute dimension in the attribute dimension set, deleting the feature data corresponding to the attribute dimension in the initial feature data sets of the target vehicle dealers when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the target vehicle dealers is smaller than a missing threshold.
The implementation principle and technical effect of the computer device provided by the embodiment of the present application are similar to those of the method embodiment described above, and are not described herein again.
In an embodiment of the application, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of:
acquiring target characteristic data sets of a plurality of target vehicle merchants, wherein the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to n attribute dimensions respectively, and n is a positive integer; acquiring a clustering weight of each attribute dimension in the n attribute dimensions according to a relative importance degree coefficient between every two attribute dimensions in the n attribute dimensions; and classifying the plurality of target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the plurality of target vehicle merchants.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: obtaining a relative importance degree coefficient between each attribute dimension of the n attribute dimensions and a target attribute dimension to obtain n relative importance degree coefficients, wherein the target attribute dimension is any one of the n attribute dimensions; and multiplying the n relative importance degree coefficients, and calculating the clustering weight of the target attribute dimension according to the product obtained by multiplication.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: determining a plurality of first-order attribute dimensions according to the n attribute dimensions, wherein the first-order attribute dimensions comprise a plurality of second-order attribute dimensions, and the second-order attribute dimensions are any one of the n attribute dimensions; acquiring a clustering weight of each first-order attribute dimension according to a relative importance degree coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions; aiming at each first-order attribute dimension, acquiring a plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance degree coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions; and aiming at each first-order attribute dimension, calculating the clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of each first-order attribute dimension and the first clustering weight of each second-order attribute dimension in the first-order attribute dimension.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: acquiring initial feature data sets of a plurality of target vehicle dealers, wherein the initial feature data set of the i-th target vehicle dealer comprises feature data of the i-th target vehicle dealer corresponding respectively to k_i attribute dimensions, i being a positive integer and k_i being a positive integer; and carrying out normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle dealers to obtain the target feature data sets of the plurality of target vehicle dealers.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: acquiring an attribute dimension set, wherein the attribute dimension set comprises all attribute dimensions corresponding to initial characteristic data sets of a plurality of target vehicle merchants; for each attribute dimension in the attribute dimension set, acquiring the number of target sets corresponding to the attribute dimension, wherein the target sets are initial characteristic data sets containing characteristic data corresponding to the attribute dimension; and according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set, performing normalization processing on the feature data in the initial feature data sets of the plurality of target vehicle merchants.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: for each attribute dimension in the attribute dimension set, when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold value, acquiring the median of the feature data corresponding to the attribute dimension contained in the target sets; and regarding each attribute dimension in the attribute dimension set, taking the median as the feature data of the attribute dimension corresponding to other initial feature data sets except the target set in the initial feature data sets of the plurality of target vehicle dealers.
In one embodiment of the application, the computer program, when executed by the processor, may further implement the steps of: and for each attribute dimension in the attribute dimension set, deleting the feature data corresponding to the attribute dimension in the initial feature data sets of the target vehicle dealers when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the target vehicle dealers is smaller than a missing threshold.
The implementation principle and technical effect of the computer-readable storage medium provided in the embodiment of the present application are similar to those of the method embodiment described above, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and although their description is specific and detailed, they should not be construed as limiting the scope of the claims. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present patent application shall be subject to the appended claims.

Claims (9)

1. A vehicle dealer classification method, the method comprising:
acquiring target characteristic data sets of a plurality of target vehicle merchants, wherein the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to n attribute dimensions;
determining a plurality of first-order attribute dimensions according to the n attribute dimensions, wherein the first-order attribute dimensions comprise a plurality of second-order attribute dimensions, and the second-order attribute dimensions are any one attribute dimension in the n attribute dimensions;
acquiring a clustering weight of each first-order attribute dimension according to a relative importance degree coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions;
for each first-order attribute dimension, acquiring a plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions;
for each first-order attribute dimension, calculating a clustering weight of each second-order attribute dimension in the first-order attribute dimension according to the product of the clustering weight of each first-order attribute dimension and a first clustering weight of each second-order attribute dimension in the first-order attribute dimension;
and classifying the target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the target vehicle merchants.
2. The method according to claim 1, wherein the obtaining a clustering weight of each of the n attribute dimensions according to a relative importance degree coefficient between every two attribute dimensions of the n attribute dimensions comprises:
obtaining a relative importance degree coefficient between each attribute dimension of the n attribute dimensions and a target attribute dimension to obtain n relative importance degree coefficients, wherein the target attribute dimension is any one of the n attribute dimensions;
and multiplying the n relative importance degree coefficients, and calculating the clustering weight of the target attribute dimension according to the product obtained by multiplication.
3. The method of claim 1, wherein said obtaining a target feature data set for a plurality of target vehicle dealers comprises:
acquiring initial feature data sets of the plurality of target vehicle dealers, wherein the initial feature data set of the i-th target vehicle dealer comprises feature data of the i-th target vehicle dealer corresponding respectively to k_i attribute dimensions, i being a positive integer and k_i being a positive integer;
and carrying out normalization processing on the characteristic data in the initial characteristic data sets of the plurality of target vehicle merchants to obtain the target characteristic data sets of the plurality of target vehicle merchants.
4. The method according to claim 3, wherein the normalizing the characteristic data in the initial characteristic data sets of the plurality of target vehicle dealers comprises:
acquiring an attribute dimension set, wherein the attribute dimension set comprises all attribute dimensions corresponding to the initial feature data sets of the target vehicle merchants;
for each attribute dimension in the attribute dimension set, acquiring the number of target sets corresponding to the attribute dimension, wherein the target sets are initial characteristic data sets containing characteristic data corresponding to the attribute dimension;
and normalizing the characteristic data in the initial characteristic data sets of the plurality of target vehicle merchants according to the number of the target sets corresponding to each attribute dimension in the attribute dimension sets.
5. The method according to claim 4, wherein the normalizing the feature data in the initial feature data sets of the plurality of target vehicle dealers according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set comprises:
for each attribute dimension in the attribute dimension set, when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the plurality of target vehicle dealers is greater than or equal to a missing threshold, acquiring a median of the feature data corresponding to the attribute dimension contained in the target set;
and regarding each attribute dimension in the attribute dimension set, taking the median as the feature data of the attribute dimension corresponding to other initial feature data sets except the target set in the initial feature data sets of the multiple target vehicle dealers.
6. The method according to claim 4, wherein the normalizing the feature data in the initial feature data sets of the plurality of target vehicle dealers according to the number of the target sets corresponding to each attribute dimension in the attribute dimension set comprises:
and for each attribute dimension in the attribute dimension set, deleting the feature data corresponding to the attribute dimension in the initial feature data sets of the target vehicle dealers when the ratio of the number of the target sets corresponding to the attribute dimension to the number of the initial feature data sets of the target vehicle dealers is smaller than a missing threshold.
7. A vehicle dealer classification apparatus, characterized in that the apparatus comprises:
the target characteristic acquisition module is used for acquiring target characteristic data sets of a plurality of target vehicle merchants, wherein the target characteristic data set of each target vehicle merchant comprises characteristic data of each target vehicle merchant corresponding to the n attribute dimensions respectively;
a cluster weight obtaining module, configured to determine a plurality of first-order attribute dimensions according to the n attribute dimensions, where the first-order attribute dimensions include a plurality of second-order attribute dimensions, and the second-order attribute dimension is any one of the n attribute dimensions; acquiring a clustering weight of each first-order attribute dimension according to a relative importance degree coefficient between any two first-order attribute dimensions in the plurality of first-order attribute dimensions; for each first-order attribute dimension, acquiring a plurality of second-order attribute dimensions included in the first-order attribute dimension, and acquiring a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to a relative importance coefficient between any two second-order attribute dimensions in the plurality of second-order attribute dimensions; for each first-order attribute dimension, calculating a clustering weight of each second-order attribute dimension in the first-order attribute dimensions according to the product of the clustering weight of each first-order attribute dimension and a first clustering weight of each second-order attribute dimension in the first-order attribute dimensions;
and the classification module is used for classifying the target vehicle merchants by adopting a target clustering algorithm based on the clustering weight of each attribute dimension and the characteristic data in the target characteristic data set of the target vehicle merchants.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN201910796773.8A 2019-08-27 2019-08-27 Vehicle and merchant classification method and device, computer equipment and storage medium Active CN110610200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910796773.8A CN110610200B (en) 2019-08-27 2019-08-27 Vehicle and merchant classification method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910796773.8A CN110610200B (en) 2019-08-27 2019-08-27 Vehicle and merchant classification method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110610200A CN110610200A (en) 2019-12-24
CN110610200B true CN110610200B (en) 2022-05-20

Family

ID=68890456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910796773.8A Active CN110610200B (en) 2019-08-27 2019-08-27 Vehicle and merchant classification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110610200B (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6260038B1 (en) * 1999-09-13 2001-07-10 International Business Machines Corporation Clustering mixed attribute patterns
US7689484B2 (en) * 2003-07-29 2010-03-30 Ford Motor Company Method and system for financing acquisition of vehicles
WO2015001416A1 (en) * 2013-07-05 2015-01-08 Tata Consultancy Services Limited Multi-dimensional data clustering
CN105447117B (en) * 2015-11-16 2019-03-26 北京邮电大学 A kind of method and apparatus of user's cluster
CN109522495B (en) * 2018-11-21 2021-04-06 世纪龙信息网络有限责任公司 Data analysis method and device, computer equipment and storage medium
CN109472322B (en) * 2018-12-04 2020-11-27 东软集团股份有限公司 Classification method and device based on clustering, storage medium and electronic equipment
CN109858518B (en) * 2018-12-26 2021-07-06 中译语通科技股份有限公司 Large data set clustering method based on MapReduce
CN110008321B (en) * 2019-03-07 2021-06-25 腾讯科技(深圳)有限公司 Information interaction method and device, storage medium and electronic device

Also Published As

Publication number Publication date
CN110610200A (en) 2019-12-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant