CN115545113A - KNN algorithm-based user classification method and device - Google Patents

Publication number
CN115545113A
Authority
CN
China
Prior art keywords
knn
user classification
algorithm
model
personal information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211271947.7A
Other languages
Chinese (zh)
Inventor
刘京
李亚雄
马小菲
刘航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202211271947.7A
Publication of CN115545113A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a user classification method and device based on the KNN algorithm. The method comprises: receiving personal information of a user; and classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on the KNN algorithm. The KNN classification algorithm adopted by the invention is simple and easy to use, the model training time is short, and the structure of the established model is determined entirely by the data, which matches practical conditions. The algorithm is well suited to classification with large sample sizes and is insensitive to outliers, so the method is highly applicable to the classification of bank loan data, predicts well, and can provide a credible reference for predicting and controlling loan default risk.

Description

KNN algorithm-based user classification method and device
Technical Field
The application belongs to the technical field of computer data processing, and particularly relates to a user classification method and device based on a KNN algorithm.
Background
In the prior art, with the increasing market demand, the loan transaction amount of the banking industry is continuously increased. Due to the double consideration of economic benefit maximization and credit risk avoidance, a bank needs to accurately classify the loan states of the loan clients, and realize accurate classification prediction on the basis, so that the bank is helped to control the credit default risk of the clients, loss is effectively avoided, and the requirement of loan risk control of a commercial bank is met.
Disclosure of Invention
The user classification method and device based on the KNN algorithm provided by the invention have the following advantages: compared with the SVM (support vector machine) algorithm, the KNN classification algorithm has lower training time complexity, makes no assumptions about the data distribution, achieves high classification accuracy, is insensitive to outliers, and is suitable for classification with large sample sizes. The KNN model involves the selection of only one parameter, K, and in practice the optimal K value can be selected by cross-validation.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, the present invention provides a user classification method based on a KNN algorithm, including:
receiving personal information of a user;
and classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
In one embodiment, the method for establishing the KNN user classification model includes:
establishing an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
training the initial model through the fit(X_train, y_train) function in the sklearn library to generate the KNN user classification model.
In one embodiment, the KNN algorithm-based user classification method further includes:
the data set is randomly divided into a training set and a test set using the train_test_split function in the machine learning library sklearn.
In one embodiment, the KNN algorithm-based user classification method further includes:
preprocessing the personal information;
and carrying out normalization processing on the training set.
In a second aspect, the present invention provides a user classification device based on the KNN algorithm, where the device includes:
the personal information receiving module is used for receiving personal information of a user;
and the user classification module is used for classifying a plurality of users according to the personal information and a pre-established KNN user classification model, and the KNN user classification model is generated based on a KNN algorithm.
In one embodiment, the user classification apparatus based on KNN algorithm further includes: the classification model establishing module is used for establishing the KNN user classification model, and comprises the following steps:
the initial model establishing module is used for establishing an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
a classification model establishing unit, configured to train the initial model through the fit(X_train, y_train) function in the sklearn library, so as to generate the KNN user classification model.
In one embodiment, the user classification device based on the KNN algorithm further includes:
and the data set dividing module is used for randomly dividing the data set into a training set and a test set by using the train_test_split function in the machine learning library sklearn.
In one embodiment, the user classification device based on the KNN algorithm further includes:
the information preprocessing module is used for preprocessing the personal information;
and the training set normalization module is used for carrying out normalization processing on the training set.
In a third aspect, the invention provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of a KNN algorithm based user classification method.
In a fourth aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the KNN algorithm-based user classification method when executing the program.
In a fifth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of a KNN algorithm-based user classification method.
As can be seen from the above description, an embodiment of the present invention provides a user classification method and device based on the KNN algorithm: first, personal information of a user is received; then a plurality of users are classified according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on the KNN algorithm. The KNN classification algorithm adopted by the invention is simple and easy to use, the model training time is short, and the structure of the established model is determined entirely by the data, which matches practical conditions. The algorithm is well suited to classification with large sample sizes and is insensitive to outliers, so the method is highly applicable to the classification of bank loan data, predicts well, and can provide a credible reference for predicting and controlling loan default risk.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a first user classification method based on a KNN algorithm according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a user classification method based on the KNN algorithm according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of step 300 in the KNN algorithm-based user classification method according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a user classification method based on the KNN algorithm according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a user classification method based on the KNN algorithm according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a user classification method based on the KNN algorithm according to an embodiment of the present invention;
Fig. 7 is a block diagram of a user classification device based on the KNN algorithm according to an embodiment of the present invention;
Fig. 8 is a block diagram of a user classification device based on the KNN algorithm according to an embodiment of the present invention;
Fig. 9 is a block diagram of the classification model establishing module 30 according to an embodiment of the present invention;
Fig. 10 is a block diagram of a user classification device based on the KNN algorithm according to an embodiment of the present invention;
Fig. 11 is a block diagram of a user classification device based on the KNN algorithm according to an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In this technical solution, the acquisition, storage, use and processing of data comply with the relevant national laws and regulations.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The embodiment of the present invention provides a specific implementation manner of a user classification method based on a KNN algorithm, and referring to fig. 1, the method specifically includes the following contents:
step 100: receiving personal information of a user;
Specifically, loan-related data of bank customers are collected and the sample data are preprocessed. Preprocessing includes filtering out fields that have an excessive missing-data rate or are irrelevant to loan status prediction, deduplicating data based on customer identity, supplementing missing values according to the actual business, and so on.
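The three preprocessing operations named above can be sketched in plain Python as follows. This is only an illustration: the field names, the 50% missing-rate threshold, and the mean-based fill rule are assumptions, since the text does not specify them.

```python
# Hypothetical raw records; field names and values are illustrative only.
records = [
    {"customer_id": "C1", "annual_income": 50.0, "debt_ratio": 0.30, "fax": None},
    {"customer_id": "C1", "annual_income": 50.0, "debt_ratio": 0.30, "fax": None},  # duplicate
    {"customer_id": "C2", "annual_income": None, "debt_ratio": 0.10, "fax": None},
    {"customer_id": "C3", "annual_income": 70.0, "debt_ratio": 0.50, "fax": "x"},
]

MAX_MISSING_RATE = 0.5  # assumed threshold for an "excessive" missing-data rate


def preprocess(rows, key="customer_id", max_missing=MAX_MISSING_RATE):
    fields = [f for f in rows[0] if f != key]
    # 1) Drop fields whose missing rate exceeds the threshold.
    kept = [
        f for f in fields
        if sum(r[f] is None for r in rows) / len(rows) <= max_missing
    ]
    # 2) Deduplicate on the customer identifier, keeping the first record.
    seen, unique = set(), []
    for r in rows:
        if r[key] not in seen:
            seen.add(r[key])
            unique.append({key: r[key], **{f: r[f] for f in kept}})
    # 3) Fill remaining missing values with the field mean (an assumed
    #    stand-in for the business-specific rule mentioned in the text).
    for f in kept:
        present = [r[f] for r in unique if r[f] is not None]
        mean = sum(present) / len(present)
        for r in unique:
            if r[f] is None:
                r[f] = mean
    return unique


clean = preprocess(records)
```

In this toy input the `fax` field is dropped because three of its four values are missing, the repeated `C1` record is removed, and the missing income of `C2` is filled with the mean of the remaining incomes.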
Step 200: and classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
In the prior art, the Logistic regression algorithm is one of the most commonly used algorithms in the field of binary classification, but its drawbacks are obvious: it is prone to under-fitting and its classification accuracy is not high. The K-means algorithm is a clustering algorithm that can also serve the purpose of classification, but its number of clusters K is an input parameter whose specific value greatly influences the classification result. Although the SVM algorithm classifies well, it is difficult to apply to large-scale training samples, and its training time complexity is high.
Compared with the SVM (support vector machine) algorithm, the KNN classification algorithm adopted in step 200 has lower training time complexity, makes no assumptions about the data distribution, achieves high classification accuracy, is insensitive to outliers, and is suitable for classification with large sample sizes.
Specifically: the distance between the test data and each training sample is calculated; the distances are sorted in increasing order; the K points with the smallest distances are selected; the frequency of each category among these K points is determined; and the category with the highest frequency among the K points is returned as the predicted class of the test data.
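The five steps just listed can be written out directly. The following is a minimal sketch in plain Python, assuming Euclidean distance and a simple majority vote; the function name `knn_predict`, the toy data, and the labels are illustrative and do not come from the patent.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    # 1) Distance from the query point to every training point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # 2) Sort by increasing distance and 3) keep the K nearest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 4) Count how often each category occurs among the K points.
    votes = Counter(label for _, label in nearest)
    # 5) Return the most frequent category.
    return votes.most_common(1)[0][0]


# Toy data: two features per customer, labels are illustrative.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = ["normal", "normal", "normal", "default", "default", "default"]

prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
```

Here the query point lies inside the first group, so all three of its nearest neighbors carry the label `"normal"`.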
As can be seen from the above description, an embodiment of the present invention provides a user classification method based on the KNN algorithm: first, personal information of a user is received; then a plurality of users are classified according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on the KNN algorithm. The KNN classification algorithm adopted by the invention is simple and easy to use, the model training time is short, and the structure of the established model is determined entirely by the data, which matches practical conditions. The algorithm is well suited to classification with large sample sizes and is insensitive to outliers, so the method is highly applicable to the classification of bank loan data, predicts well, and can provide a more credible reference for predicting and controlling loan default risk.
In an embodiment, referring to fig. 2, the KNN algorithm-based user classification method further includes:
step 300: establishing the KNN user classification model; next, referring to fig. 3, step 300 further comprises:
step 301: establishing an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
Before the model is created, the independent variables (features) and the dependent variable (class label) are set: the bank customer's basic personal information, annual income, debt ratio, credit card debt, other debt, loan amount, etc. are set as independent variables, and the loan status, being the result to be predicted, is set as the dependent variable.
Step 302: training the initial model through the fit(X_train, y_train) function in the sklearn library to generate the KNN user classification model.
First, the data set is divided into a training set and a test set: to improve the generalization ability of the model, the data set needs to be divided by random sampling into a training set, which is used to learn or train the model, and a test set, which is used to evaluate the prediction performance of the trained model. The data volume size ratio of the training set and the test set is generally 7 or 6, and the data set can be randomly divided into the training set and the test set by using the train_test_split function in the machine learning library sklearn. The four arguments of train_test_split used here mean:
train_data: the sample data (features) to be partitioned, passed as the first positional argument
train_target: the results (labels) of the sample data to be partitioned, passed as the second positional argument
test_size: the proportion of the sample data assigned to the test set
random_state: the random number seed. Setting it to a fixed integer makes the split identical on every run; when it is left unset (None), the split differs from run to run.
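A call using these four arguments might look like the sketch below. The toy arrays, the 30% test share, and the seed value 42 are assumptions chosen for illustration; `train_data` and `train_target` are simply local variable names passed positionally.

```python
from sklearn.model_selection import train_test_split

# Toy stand-ins for the customer features and loan-status labels.
train_data = [[i, i * 2] for i in range(10)]
train_target = [0, 1] * 5

X_train, X_test, y_train, y_test = train_test_split(
    train_data,
    train_target,
    test_size=0.3,     # 30% of the samples go to the test set (assumed share)
    random_state=42,   # fixed seed, so the split is reproducible
)
```

Because the seed is fixed, repeating the call with the same inputs returns the same partition, which makes later accuracy figures comparable between runs.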
Then the training set is substituted into the KNN model to train it. The KNN model can be built using the KNeighborsClassifier function in the sklearn library; its more commonly used parameters are explained below:
n_neighbors: namely K, the number of neighbors; the default is 5;
weights: the weight function used in prediction; the default, uniform, gives every neighbor an equal vote, while distance weights neighbors by the inverse of their distance;
algorithm: the algorithm used to compute the nearest neighbors; the optional values are auto, ball_tree, kd_tree and brute, and the default, auto, selects a suitable algorithm automatically;
p: the power parameter of the distance metric; p=1 indicates the Manhattan distance and p=2 indicates the Euclidean distance;
Then the fit(X_train, y_train) function is called to train the model.
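The parameters described above map onto the constructor as in the following sketch. The training points and the choice of K = 3 are illustrative assumptions; the text itself does not fix these values.

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy training data: two well-separated groups (illustrative values only).
X_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]
y_train = [0, 0, 0, 1, 1, 1]

model = KNeighborsClassifier(
    n_neighbors=3,       # K
    weights="uniform",   # every neighbor votes equally
    algorithm="auto",    # let scikit-learn choose the search structure
    p=2,                 # Euclidean distance
)
model.fit(X_train, y_train)

# Classify one point near each group.
predicted = model.predict([[1.1, 1.0], [5.1, 5.0]])
```

For KNN, `fit` mainly stores the training data (and builds a search structure when one is used), which is why training is quick compared with prediction.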
In an embodiment, referring to fig. 4, the KNN algorithm-based user classification method further includes:
step 400: the data set is randomly divided into a training set and a test set using the train_test_split function in the machine learning library sklearn.
In an embodiment, referring to fig. 5, the KNN algorithm-based user classification method further includes:
step 500: preprocessing the personal information;
Preprocessing the sample data includes filtering out fields that have an excessive missing-data rate or are irrelevant to loan status prediction, deduplicating data based on customer identity, supplementing missing values according to the actual business, and so on.
Step 600: and carrying out normalization processing on the training set.
Sample data normalization: if the features (independent variables) are on different scales and one feature takes very large values, the computed distance depends mainly on that feature and is hardly influenced by features with small values, which is often inconsistent with reality. Therefore, when the scales in the data set are inconsistent, the data are normalized before the KNN classifier is used, so that the features share a uniform scale.
[Formula image not reproduced]
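The formula image is not available in the text, so the exact normalization used by the invention is unknown. As one common possibility, min-max scaling maps each feature x to (x - min) / (max - min). The sketch below assumes that choice; the helper names `fit_min_max` and `apply_min_max` and the sample values are hypothetical.

```python
def fit_min_max(rows):
    """Return per-feature (min, max) pairs computed from the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, bounds):
    """Rescale each feature with the given bounds (training rows map into [0, 1])."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant feature -> 0.0
            for value, (lo, hi) in zip(row, bounds)
        ])
    return scaled


# Illustrative features: annual income (large scale) and debt ratio (small scale).
X_train = [[50000.0, 0.10], [80000.0, 0.50], [120000.0, 0.30]]
bounds = fit_min_max(X_train)
X_train_scaled = apply_min_max(X_train, bounds)
```

After scaling, income and debt ratio both lie in the same range, so neither dominates the distance calculation. The bounds are computed on the training set only and can then be reused to scale new samples.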
As can be seen from the above description, an embodiment of the present invention provides a user classification method based on a KNN algorithm, including: firstly, receiving personal information of a user; and then classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
Compared with the SVM (support vector machine) algorithm, the KNN classification algorithm adopted by the invention has lower training time complexity, makes no assumptions about the data distribution, achieves high classification accuracy, is insensitive to outliers, and is well suited to classification with large sample sizes.
Referring to fig. 6, the present invention further provides a specific implementation of the user classification method based on the KNN algorithm. First, the invention also provides a user classification system based on the KNN algorithm, which comprises the following modules:
1) A data input module: selecting loan-related data of Bank of China customers as the data source of the algorithm, and selecting the optimal value of the parameter K by cross-validation as input;
2) A data preprocessing module: standardizing the input data;
3) A model construction module: constructing a KNN classification model according to the algorithm process;
4) A training, testing and prediction module: dividing the data set into a training set and a test set, substituting them into the model for training and testing, and predicting new data once a good classification effect is ensured.
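The data input module selects K by cross-validation, but the text does not say how. One plausible way, sketched here, is to score a few candidate values with scikit-learn's `cross_val_score` and keep the best. The synthetic data set, the candidate list, and the 5-fold setting are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the loan data set.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

candidate_ks = [1, 3, 5, 7, 9, 11]
mean_scores = {}
for k in candidate_ks:
    model = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation; the default score for a classifier is accuracy.
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[k] = scores.mean()

# Keep the K with the highest mean cross-validated accuracy.
best_k = max(mean_scores, key=mean_scores.get)
```

Odd candidate values are used here to reduce tied votes in a two-class problem; that is a common convention rather than something the text prescribes.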
The KNN algorithm, also called the K-nearest neighbor algorithm, is conceptually simple and classifies well. Its core idea is as follows: to determine which class a test sample belongs to, the K samples "closest" to the test sample are found among all the training samples, and the test sample is assigned to the class to which most of those K samples belong. Loan data of the bank's customers are collected, comprising variables such as the customer's basic personal information, annual income, debt ratio, credit card debt, other debt, loan amount and loan status, and a KNN classification model is established based on this data set: the loan status is set as the dependent variable and the rest as independent variables, and the data set is divided into a training set and a test set in a specific proportion (e.g., 6. After training and testing are completed, the model can be used to classify and predict new data, thereby solving the technical problem addressed by this patent.
S1: collecting data related to bank customer loans;
Specifically, loan-related data of bank customers are collected and the sample data are preprocessed. Preprocessing includes filtering out fields that have an excessive missing-data rate or are irrelevant to loan status prediction, deduplicating data based on customer identity, supplementing missing values according to the actual business, and so on.
S2: carrying out necessary preprocessing or standardization processing on the sample data;
S3: constructing a KNN classification model;
The training set is substituted into the KNN model to train it. The KNN model can be built using the KNeighborsClassifier function in the sklearn library; its more commonly used parameters are explained below:
n_neighbors: namely K, the number of neighbors; the default is 5;
weights: the weight function used in prediction; the default, uniform, gives every neighbor an equal vote, while distance weights neighbors by the inverse of their distance;
algorithm: the algorithm used to compute the nearest neighbors; the optional values are auto, ball_tree, kd_tree and brute, and the default, auto, selects a suitable algorithm automatically;
p: the power parameter of the distance metric; p=1 indicates the Manhattan distance and p=2 indicates the Euclidean distance;
Then the fit(X_train, y_train) function is called to train the model.
S4: the model is applied to the sample data for training, testing, and prediction.
Specifically, the test set data is used to evaluate the prediction performance of the trained KNN model. Predictions for the test set can be obtained by calling the predict function, and the accuracy of the model's predictions can be measured with the score function.
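The evaluation step can be sketched as follows, assuming scikit-learn and a synthetic data set in place of the real loan data. The 30% test share, the seeds, and K = 5 are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the preprocessed loan data.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)          # predicted class for each test sample
accuracy = model.score(X_test, y_test)  # mean accuracy on the test set
```

For a classifier, `score` returns the fraction of test samples whose predicted class matches the true label, so it equals the share of entries where `y_pred` and `y_test` agree.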
Based on the same inventive concept, the embodiment of the present application further provides a user classification device based on the KNN algorithm, which can be used to implement the method described in the above embodiment, such as the following embodiments. Because the principle of solving the problems of the user classification device based on the KNN algorithm is similar to the user classification method based on the KNN algorithm, the implementation of the user classification device based on the KNN algorithm can be realized by referring to the implementation of the user classification method based on the KNN algorithm, and repeated parts are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
An embodiment of the present invention provides a specific implementation manner of a user classification device based on a KNN algorithm, which is capable of implementing a user classification method based on a KNN algorithm, and referring to fig. 7, the user classification device based on the KNN algorithm specifically includes the following contents:
a personal information receiving module 10 for receiving personal information of a user;
and the user classification module 20 is configured to classify a plurality of users according to the personal information and a pre-established KNN user classification model, where the KNN user classification model is generated based on a KNN algorithm.
In an embodiment, referring to fig. 8, the user classification apparatus based on the KNN algorithm further includes: a classification model establishing module 30, configured to establish the KNN user classification model, referring to fig. 9, where the classification model establishing module 30 includes:
an initial model establishing module 301, configured to establish an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
a classification model establishing unit 302, configured to train the initial model through the fit(X_train, y_train) function in the sklearn library to generate the KNN user classification model.
In an embodiment, referring to fig. 10, the user classifying device based on the KNN algorithm further includes:
a data set dividing module 40, configured to randomly divide the data set into a training set and a test set by using the train_test_split function in the machine learning library sklearn.
In an embodiment, referring to fig. 11, the user classifying device based on the KNN algorithm further includes:
an information preprocessing module 50, configured to preprocess the personal information;
and a training set normalization module 60, configured to perform normalization processing on the training set.
As can be seen from the above description, an embodiment of the present invention provides a user classification device based on the KNN algorithm, which operates as follows: first, personal information of a user is received; then a plurality of users are classified according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on the KNN algorithm. The KNN classification algorithm adopted by the invention is simple and easy to use, the model training time is short, and the structure of the established model is determined entirely by the data, which matches practical conditions. The algorithm is well suited to classification with large sample sizes and is insensitive to outliers, so the device is highly applicable to the classification of bank loan data, predicts well, and can provide a credible reference for predicting and controlling loan default risk.
An embodiment of the present application further provides a specific implementation manner of an electronic device that can implement all steps in the user classification method based on the KNN algorithm in the foregoing embodiment, and referring to fig. 12, the electronic device specifically includes the following contents:
a processor 1201, a memory 1202, a communication interface 1203, and a communication bus 1204;
the processor 1201, the memory 1202 and the communication interface 1203 complete mutual communication through the communication bus 1204; the communication interface 1203 is used for implementing information transmission between related devices such as server-side devices and client-side devices;
the processor 1201 is configured to call the computer program in the memory 1202, and the processor executes the computer program to implement all the steps in the KNN algorithm-based user classification method in the above embodiments, for example, the processor executes the computer program to implement the following steps:
step 100: receiving personal information of a user;
step 200: and classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all steps of the KNN algorithm-based user classification method in the foregoing embodiments. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all steps of the method, for example the following steps:
step 100: receiving personal information of a user;
step 200: classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
All the embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, the hardware-plus-program embodiments are described only briefly because they are substantially similar to the method embodiment; for the relevant points, reference may be made to the corresponding description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Although the present application provides method steps as in embodiments or flowcharts, additional or fewer steps may be included based on routine or non-inventive labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or client product executes, it may execute sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing) according to the embodiments or methods shown in the figures.
For convenience of description, the above devices are described as being divided into various modules by functions, which are described separately. Of course, when implementing the embodiments of the present specification, the functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of multiple sub-modules or sub-units, or the like. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
The embodiments of this specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The described embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiment is described only briefly because it is substantially similar to the method embodiment; for the relevant points, reference may be made to the corresponding description of the method embodiment. In the description of the specification, reference to "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the specification. In this specification, such terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided that they do not contradict one another.
The above description is only an example of the embodiments of the present disclosure, and is not intended to limit the embodiments of the present disclosure. Various modifications and alterations to the embodiments described herein will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present specification should be included in the scope of the claims of the embodiments of the present specification.

Claims (11)

1. A user classification method based on a KNN algorithm is characterized by comprising the following steps:
receiving personal information of a user;
and classifying a plurality of users according to the personal information and a pre-established KNN user classification model, wherein the KNN user classification model is generated based on a KNN algorithm.
2. The KNN algorithm-based user classification method according to claim 1, wherein the method of building the KNN user classification model includes:
establishing an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
training the initial model through the fit(X_train, y_train) function in the sklearn library to generate the KNN user classification model.
3. The KNN algorithm-based user classification method of claim 1, further comprising:
randomly dividing the data set into a training set and a test set by using the train_test_split function.
4. The KNN-algorithm-based user classification method of claim 3, further comprising:
preprocessing the personal information;
and carrying out normalization processing on the training set.
5. A KNN algorithm-based user classification device is characterized by comprising:
the personal information receiving module is used for receiving personal information of a user;
and the user classification module is used for classifying a plurality of users according to the personal information and a pre-established KNN user classification model, and the KNN user classification model is generated based on a KNN algorithm.
6. The KNN-algorithm-based user classification apparatus of claim 5, further comprising: the classification model establishing module is used for establishing the KNN user classification model, and the classification model establishing module comprises:
the initial model establishing module is used for establishing an initial model of the KNN user classification model based on previously collected customer data by using the KNeighborsClassifier function in the Python machine learning library sklearn;
a classification model establishing unit, configured to train the initial model through the fit(X_train, y_train) function in the sklearn library, so as to generate the KNN user classification model.
7. The KNN-algorithm-based user classification device of claim 5, further comprising:
and the data set dividing module is used for randomly dividing the data set into a training set and a test set by using the train_test_split function in the machine learning library sklearn.
8. The KNN algorithm-based user classification device of claim 7, further comprising:
the information preprocessing module is used for preprocessing the personal information;
and the training set normalization module is used for performing normalization processing on the training set.
9. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the KNN algorithm based user classification method of any one of claims 1 to 4.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of a KNN algorithm based user classification method as claimed in any one of claims 1 to 4.
11. A computer-readable storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the KNN algorithm-based user classification method of any one of claims 1 to 4.
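The model-building steps recited in claims 2 to 4 (random division of the data set, normalization of the training set, and training with KNeighborsClassifier and fit(X_train, y_train)) can be sketched as below, assuming scikit-learn. The synthetic data from `make_blobs`, the 25% test share, the choice of `MinMaxScaler` as the normalization, and `n_neighbors=5` are all assumptions made for illustration; the claims do not specify them.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for previously collected customer data (two well-separated classes).
X, y = make_blobs(
    n_samples=200,
    centers=[[0.0, 0.0], [10.0, 10.0]],
    cluster_std=1.0,
    random_state=0,
)

# Claim 3: randomly divide the data set into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Claim 4: normalize the training set (min-max scaling is an assumption here).
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
# The test set reuses the statistics learned from the training set.
X_test_scaled = scaler.transform(X_test)

# Claim 2: establish the initial model, then train it with fit(X_train, y_train).
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train_scaled, y_train)

accuracy = model.score(X_test_scaled, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Fitting the scaler on the training set only, and then applying it unchanged to the test set, keeps information from the test set out of the model.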
CN202211271947.7A 2022-10-18 2022-10-18 KNN algorithm-based user classification method and device Pending CN115545113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211271947.7A CN115545113A (en) 2022-10-18 2022-10-18 KNN algorithm-based user classification method and device

Publications (1)

Publication Number Publication Date
CN115545113A true CN115545113A (en) 2022-12-30

Family

ID=84735067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211271947.7A Pending CN115545113A (en) 2022-10-18 2022-10-18 KNN algorithm-based user classification method and device

Country Status (1)

Country Link
CN (1) CN115545113A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination