CN112950359A - User identification method and device - Google Patents

Info

Publication number
CN112950359A
Authority
CN
China
Prior art keywords
data
user
preset
user data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110338740.6A
Other languages
Chinese (zh)
Other versions
CN112950359B (en)
Inventor
邓强
史博慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd
Priority to CN202110338740.6A
Publication of CN112950359A
Application granted
Publication of CN112950359B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The invention discloses a user identification method and device, and relates to the technical field of big data. In one specific implementation, the method comprises: acquiring user data from a data warehouse by calling a preset data interface; determining, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the identification model corresponding to that customer group; and calling an identification model engine, selecting a data acquisition time point to acquire the user's historical data, calculating a scoring result for the user from the historical data, and pushing the user data whose scoring result meets a preset matching condition to a third-party marketing platform. The embodiments can thereby address the low recommendation accuracy, poor user experience, and consumption of manpower and material resources associated with existing housing loan product recommendation.

Description

User identification method and device
Technical Field
The invention relates to the technical field of big data, in particular to the field of data analysis and mining, and particularly relates to a user identification method and device.
Background
In recent years, with economic development, housing has accounted for a markedly larger share of household consumption, and demand for personal housing loans is strong. In the process of implementing the invention, the inventors found at least the following problem in the prior art: at present, housing loan products are recommended to large numbers of users across a broad range based entirely on human experience, which results in low accuracy and poor user experience and consumes manpower and material resources.
Disclosure of Invention
In view of this, embodiments of the present invention provide a user identification method and device that can address the low recommendation accuracy, poor user experience, and consumption of manpower and material resources associated with existing housing loan product recommendation.
To achieve the above object, according to one aspect of an embodiment of the present invention, a user identification method is provided, including: acquiring user data from a data warehouse by calling a preset data interface; determining, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the identification model corresponding to that customer group; and calling an identification model engine, selecting a data acquisition time point to acquire the user's historical data, calculating a scoring result for the user from the historical data, and pushing the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
Optionally, after acquiring the user data from the data warehouse by calling the preset data interface, the method includes:
calling a preset screening component and filtering the acquired user data to generate the user data to be identified.
Optionally, determining the customer group according to the preset customer group classification component includes:
sending a data request to a credit reporting database server according to the user data, and judging whether the received processing result contains a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
Optionally, acquiring the historical data of the user includes:
acquiring the corresponding configuration information according to the identification model, and calculating, over preset time windows, the corresponding statistical variables within each window for the different types of configuration information.
Optionally, the method further includes:
calling an evaluation engine of the identification model, and processing the identification model corresponding to each customer group according to a preset evaluation model and a stability verification model; and
triggering a model training program when the monitored processing result does not meet a preset target condition, so as to adjust the parameters of the identification model.
Optionally, the method further includes:
performing machine learning separately on each customer group based on the LightGBM model to train the corresponding identification models, and then constructing the identification model engine.
In addition, the invention also provides a user identification device, which includes: an acquisition module, configured to acquire user data from a data warehouse by calling a preset data interface; and a processing module, configured to determine, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the identification model corresponding to that customer group, and to call an identification model engine, select a data acquisition time point to acquire the user's historical data, calculate a scoring result for the user from the historical data, and push the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
Optionally, after acquiring the user data from the data warehouse by calling the preset data interface, the acquisition module is further configured to:
call a preset screening component and filter the acquired user data to generate the user data to be identified.
Optionally, the processing module determining the customer group according to the preset customer group classification component includes:
sending a data request to a credit reporting database server according to the user data, and judging whether the received processing result contains a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
Optionally, the processing module acquiring the historical data of the user includes:
acquiring the corresponding configuration information according to the identification model, and calculating, over preset time windows, the corresponding statistical variables within each window for the different types of configuration information.
Optionally, the device further includes:
a monitoring module, configured to call an evaluation engine of the identification model and process the identification model corresponding to each customer group according to a preset evaluation model and a stability verification model, and to trigger a model training program when the monitored processing result does not meet a preset target condition, so as to adjust the parameters of the identification model.
Optionally, the processing module is further configured to:
perform machine learning separately on each customer group based on the LightGBM model to train the corresponding identification models, and then construct the identification model engine.
One embodiment of the above invention has the following advantages or benefits: based on user data such as the customer's gender, age bracket, education level, occupational attributes, and region, an identification model is built from multi-dimensional historical data such as the customer's transaction records and financial assets at the bank. The advertising target audience is thereby segmented, and a target group with high housing loan demand and a high likelihood of taking out a housing loan is selected and pushed to a third-party marketing platform for ad delivery. This helps locate target consumers accurately, narrows the marketing audience, delivers marketing information precisely to target consumers, raises the probability of marketing success, lowers marketing cost, and improves customer acquisition efficiency, so as to maximize the marketing effect.
Further effects of the above non-conventional optional implementations are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
Fig. 1 is a schematic diagram of the main flow of a user identification method according to a first embodiment of the present invention;
Fig. 2 is a schematic diagram of PSI parameters according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the main flow of a user identification method according to a second embodiment of the present invention;
Fig. 4 is a schematic diagram of the main flow of identification model construction according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the main modules of a user identification device according to a first embodiment of the present invention;
Fig. 6 is a schematic diagram of the main modules of a user identification device according to a second embodiment of the present invention;
Fig. 7 is an exemplary system architecture diagram to which embodiments of the present invention may be applied;
Fig. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of the main flow of a user identification method according to a first embodiment of the present invention. As shown in Fig. 1, the user identification method includes:
Step S101: acquiring user data from a data warehouse by calling a preset data interface.
In an embodiment, after step S101 is performed, a preset screening component may be called to filter the acquired user data and generate the user data to be identified. That is, the screening component may be configured in advance for different business requirements so as to screen out customers who do not meet the business admission conditions, for example customers who do not meet the admission conditions for the housing loan business.
Step S102: determining, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the identification model corresponding to that customer group.
In an embodiment, the specific implementation includes: sending a data request to a credit reporting database server according to the user data, and judging whether the received processing result contains a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
That is, a data request is sent to the credit reporting database server. If credit reporting data (i.e., the first target attribute value) exists in the received processing result, the user is determined to belong to the first customer group (e.g., the credit reporting customer group). If no credit reporting data exists in the received processing result, the stored second target attribute table (e.g., a payroll agency data table) is called. If the customer has payroll agency data, the customer belongs to the second customer group (e.g., the payroll agency customer group); otherwise the customer belongs to the third customer group (e.g., the general customer group).
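The three-way classification described above can be sketched in Python as a simple cascade. This is only an illustrative sketch: the function name `classify_customer_group`, the `has_credit_data` key, the group labels, and the in-memory stand-ins for the credit database server and the second target attribute table are all assumptions, not names from the patent.

```python
from typing import Callable, Optional


def classify_customer_group(
    user_id: str,
    query_credit_server: Callable[[str], Optional[dict]],
    payroll_table: set,
) -> str:
    """Return the group label for one user, following the cascade in the text."""
    # Step 1: send a data request to the credit database server.
    result = query_credit_server(user_id)
    # Assumption: the "first target attribute value" means credit data is present.
    if result is not None and result.get("has_credit_data"):
        return "first_group_credit"
    # Step 2: fall back to the stored second target attribute table.
    if user_id in payroll_table:
        return "second_group_payroll"
    # Step 3: every remaining user belongs to the third (general) group.
    return "third_group_general"


# Hypothetical stand-ins for the remote server and the stored table.
fake_credit_server = {
    "u1": {"has_credit_data": True},
    "u2": {"has_credit_data": False},
}
payroll_table = {"u2"}

groups = {
    uid: classify_customer_group(uid, fake_credit_server.get, payroll_table)
    for uid in ["u1", "u2", "u3"]
}
print(groups)
```

The returned label would then be used as the key for looking up the matching identification model.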
Step S103: calling an identification model engine, selecting a data acquisition time point to acquire the user's historical data, calculating a scoring result for the user from the historical data, and pushing the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
In an embodiment, when the user's historical data is acquired, the corresponding configuration information may be obtained according to the identification model, and the corresponding statistical variables within each window are then calculated over preset time windows for the different types of configuration information. That is, different customer groups correspond to different identification models, so the configuration information (i.e., the input parameters) of the identification models differs. The configuration information of the matched identification model is therefore obtained first, and the statistical variables within each window are then calculated according to the type of the configuration information. For example, for numeric fields in the configuration information, the sum, mean, quantiles, minimum, maximum, standard deviation, coefficient of variation, and the like are calculated over each time window. For categorical variables in the configuration information, the number of occurrences of the main type, the number of distinct types that occur, and the like are calculated over each time window.
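As a rough illustration of the per-window statistics listed above, the following Python sketch aggregates dated records that fall inside a window ending at the data acquisition time point. The function names, the record layout `(date, value)`, the 90-day window, and the use of the median as the example quantile are assumptions made for this sketch.

```python
import statistics
from collections import Counter
from datetime import date, timedelta


def numeric_window_stats(records, as_of, window_days):
    """Aggregate (date, value) records in the window (as_of - window_days, as_of]."""
    start = as_of - timedelta(days=window_days)
    values = [v for d, v in records if start < d <= as_of]
    if not values:
        return None
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return {
        "sum": sum(values),
        "mean": mean,
        "median": statistics.median(values),  # one example quantile
        "min": min(values),
        "max": max(values),
        "std": std,
        "cv": std / mean if mean else None,  # coefficient of variation
    }


def categorical_window_stats(records, as_of, window_days):
    """Count occurrences of (date, label) records in the same kind of window."""
    start = as_of - timedelta(days=window_days)
    counts = Counter(label for d, label in records if start < d <= as_of)
    if not counts:
        return None
    return {
        "top_type_count": counts.most_common(1)[0][1],  # occurrences of the main type
        "n_types": len(counts),  # number of distinct types that occur
    }


as_of = date(2021, 3, 31)
amounts = [
    (date(2020, 6, 1), 999.0),  # outside the 90-day window
    (date(2021, 1, 5), 100.0),
    (date(2021, 2, 10), 300.0),
    (date(2021, 3, 1), 200.0),
]
kinds = [
    (date(2021, 1, 5), "transfer"),
    (date(2021, 2, 10), "transfer"),
    (date(2021, 3, 1), "payment"),
]
num_stats = numeric_window_stats(amounts, as_of, 90)
cat_stats = categorical_window_stats(kinds, as_of, 90)
print(num_stats)
print(cat_stats)
```

Running the same functions with several window lengths (for example 30, 90, and 180 days) would yield one set of features per window.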
The method further includes performing machine learning separately on each customer group based on the LightGBM model to train the corresponding identification models, and then constructing the identification model engine. That is, in this embodiment, machine learning is performed on the historical data of the different customer groups based on the LightGBM model to train a corresponding identification model for each group, from which the identification model engine is built. LightGBM is an ensemble algorithm and an efficient implementation of the GBDT algorithm. The main idea of GBDT is to train weak learners (decision trees) iteratively to obtain an optimal model: at each iteration, the negative gradient of the loss function is used as an approximation of the residual of the current model, and a new decision tree is fitted to it.
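To make the "fit each new tree to the negative gradient" idea concrete, here is a deliberately tiny gradient boosting sketch in pure Python. It is not LightGBM and does not use its API: it uses one-split regression stumps on a single feature with squared loss, for which the negative gradient equals the residual. The function names, the learning rate, and the toy data are assumptions for illustration.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((t - left_mean) ** 2 for t in left)
        sse += sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Boost stumps on squared loss; returns a prediction function."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For L = (y - f)^2 / 2 the negative gradient is the residual y - f.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
print(model(2), model(5))
```

On this toy data each round halves the remaining residual, so the predictions approach 0 for small `x` and 1 for large `x`.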
In another embodiment, an evaluation engine of the identification model may further be called to process the identification model corresponding to each customer group according to a preset evaluation model and a stability verification model. When the monitored processing result does not meet the preset target condition, a model training program is triggered to adjust the parameters of the identification model. Preferably, the preset evaluation model mainly uses Gini and KS, both of which measure the discriminating power of the model; a model with poor discrimination can be optimized by parameter tuning. Note that only training data may be used for parameter tuning; test data must not be used. Gini evaluates how well the scorecard model separates customers, and KS is commonly used to measure how well the model separates positive and negative samples.
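The two discrimination metrics can be computed from scores and binary labels as in the sketch below. It assumes the common definitions KS = max |TPR - FPR| over score thresholds and Gini = 2·AUC - 1; the patent does not spell out its formulas, so these definitions, the function name `ks_and_gini`, and the toy data are assumptions.

```python
def ks_and_gini(scores, labels):
    """Return (KS, Gini) for model scores and binary labels (1 = positive)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sweep the threshold from the highest score downward.
    ordered = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    ks = 0.0
    auc = 0.0
    prev_tpr = prev_fpr = 0.0
    i = 0
    while i < len(ordered):
        # Handle tied scores together so ties do not add spurious steps.
        j = i
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            tp += ordered[j][1]
            fp += 1 - ordered[j][1]
            j += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        ks = max(ks, abs(tpr - fpr))
        # Trapezoid rule for the area under the ROC curve.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return ks, 2 * auc - 1


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
ks, gini = ks_and_gini(scores, labels)
print(round(ks, 4), round(gini, 4))
```

A monitoring job could compare these values against a preset target and trigger retraining when they fall below it.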
In addition, the stability verification model is measured by the PSI metric. The test set and the training set are grouped by score, using either equal-width score intervals or equal-frequency bins (i.e., each group holds the same share of samples). The share of each training group in the total training samples and the share of each test group in the total test samples are calculated, and the per-group terms are summed: $\mathrm{PSI} = \sum_i (p_i^{\text{train}} - p_i^{\text{test}}) \ln\left(p_i^{\text{train}} / p_i^{\text{test}}\right)$, where $p_i^{\text{train}}$ and $p_i^{\text{test}}$ are the shares of group $i$ in the training and test samples. Reference values for PSI may be as shown in Fig. 2.
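A minimal Python sketch of this PSI calculation follows, taking the number of samples per score group in each set as input. The function name `population_stability_index` and the small floor `eps` that guards against empty groups are assumptions added for the sketch.

```python
import math


def population_stability_index(train_counts, test_counts, eps=1e-6):
    """PSI over matching score groups: sum of (p_train - p_test) * ln(p_train / p_test)."""
    n_train, n_test = sum(train_counts), sum(test_counts)
    total = 0.0
    for train_n, test_n in zip(train_counts, test_counts):
        # Floor the shares so an empty group does not cause log(0) or division by zero.
        p_train = max(train_n / n_train, eps)
        p_test = max(test_n / n_test, eps)
        total += (p_train - p_test) * math.log(p_train / p_test)
    return total


# Four equal-frequency score groups in the training set, slightly shifted in the test set.
psi_same = population_stability_index([250, 250, 250, 250], [250, 250, 250, 250])
psi_shift = population_stability_index([250, 250, 250, 250], [200, 250, 250, 300])
print(psi_same, round(psi_shift, 4))
```

Every term in the sum is non-negative, so PSI is zero only when the two distributions match group by group.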
It is worth noting that the method can also evaluate the business effect of the identification model; analyzing the model's business effect gives a good grasp of how the model will affect future business. A cut-off is set on the scores of the modeling data so that the proportion of housing loan customers in the customer group above the cut-off is higher.
In still other embodiments, based on the constructed precision marketing model for housing loans, the model results for the users remaining after batch screening are obtained to give each user's identification model score, which evaluates the likelihood that the user will take out a housing loan. A cut-off is preset, and the user data whose scoring result meets the preset matching condition (i.e., a scoring result above the cut-off) is pushed to a third-party marketing platform for precision marketing.
In conclusion, precision marketing of housing loans narrows the marketing range, greatly reduces marketing cost, avoids the aversion that irrelevant marketing information tends to provoke in users, and improves marketing efficiency. Marketing is directed only at customers who are likely to have housing loan demand, so irrelevant customers are not disturbed unnecessarily; customers without such demand do not receive housing loan marketing, which reduces customer aversion and avoids losing some potential customers. In addition, the precision marketing model for housing loans is built on multi-dimensional data using the LightGBM machine learning technique, so customers with a high likelihood of housing loan demand can be predicted with high accuracy and a low error rate, allowing marketing strategies to be formulated in a targeted manner.
Fig. 3 is a schematic diagram of the main flow of a user identification method according to a second embodiment of the present invention. As shown in Fig. 3, the user identification method may include: acquiring user data from the data warehouse by calling a preset data interface, and then calling a preset screening component to filter the acquired user data (i.e., screen the customers) so as to generate the user data to be identified. For example, to screen out customers who do not meet the admission conditions for the housing loan business, the preset screening component may require that: the customer is 18 to 65 years old; the customer has been a customer of the bank for at least half a year; the customer currently has no housing loan business (including applications in progress and loans already disbursed); the customer is not blacklisted; the customer's loans and credit cards are currently in normal status; the historical repayment records for in-bank loans and credit cards show no more than 6 overdue payments and no overdue record of more than 90 days; the historical repayment records for loans and credit cards in the credit report show no more than 6 overdue payments and no overdue record of more than 90 days; and so on.
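The admission conditions listed above amount to a conjunction of simple checks, which can be sketched in Python as follows. The `Customer` fields and the function name `passes_admission` are hypothetical, and the reading of "not more than 6" as a count of overdue payments is an assumption; in-bank and credit-report repayment histories are collapsed into one pair of fields to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    age: int
    months_with_bank: int
    has_housing_loan: bool  # includes applications in progress
    blacklisted: bool
    accounts_normal: bool  # loans and credit cards currently in normal status
    overdue_count: int  # assumed meaning of "not more than 6"
    max_overdue_days: int


def passes_admission(c: Customer) -> bool:
    """Return True when the customer meets every screening condition."""
    return (
        18 <= c.age <= 65
        and c.months_with_bank >= 6
        and not c.has_housing_loan
        and not c.blacklisted
        and c.accounts_normal
        and c.overdue_count <= 6
        and c.max_overdue_days <= 90
    )


customers = [
    Customer(30, 24, False, False, True, 0, 0),    # passes
    Customer(17, 12, False, False, True, 0, 0),    # too young
    Customer(40, 3, False, False, True, 0, 0),     # customer for under half a year
    Customer(50, 60, True, False, True, 0, 0),     # already has a housing loan
    Customer(45, 36, False, False, True, 2, 120),  # overdue for more than 90 days
]
to_identify = [c for c in customers if passes_admission(c)]
print(len(to_identify))
```

Keeping each rule as a separate boolean term makes it straightforward to reconfigure the component for a different business requirement.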
Based on the user data, a data request is sent to a credit reporting database server, and it is judged whether the received processing result contains a first target attribute value (for example, credit reporting data exists within 3 months of the current time point). If so, the user is determined to belong to the credit reporting customer group; if not, a stored second target attribute table (such as a payroll agency data table) is called and it is judged whether the user data exists in it. If it does, the user is determined to belong to the payroll agency customer group; if not, the user is determined to belong to the general customer group. The identification model corresponding to the customer group is then matched (for example, the credit reporting customer group model, the payroll agency customer group model, or the general customer group model), the identification model engine is called, a data acquisition time point is selected to acquire the user's historical data, a scoring result for the user is calculated from the historical data, and the user data whose scoring result is greater than or equal to a preset scoring threshold is pushed to a third-party marketing platform. Preferably, the model-input features are processed, for example by feature derivation, before the identification model engine is called.
In this embodiment, a credit reporting customer group identification model, a payroll agency customer group identification model, and a general customer group identification model are built with LightGBM on three separate training and test sets, using the screened features. Based on the feature weights reported by each identification model, features that did not enter the model, that carry low weight, or that cannot be interpreted are deleted, and modeling is iterated. In this way the models learn the characteristics of the different customer groups and how they differ, and can judge more accurately whether a customer will apply for a housing loan. As shown in Fig. 4, a target Y must be defined for the identification model. Housing loan demand customer (Y = 1): the personal credit information of the customer and the customer's spouse shows that the customer took out a "personal housing provident fund loan" or "personal housing loan" within the half year after the observation time point, or the customer has a commercial housing loan or provident fund loan that was approved within the half year after the observation time point. In addition, the personal credit information of the customer and spouse shows no application for a "personal housing provident fund loan" or "personal housing loan" before the observation time point, and the customer has no approved commercial housing loan or provident fund loan before the observation time point.
Customer without housing loan demand (Y = 0): the customer and spouse (if any) have a credit report within the half year after the observation time point, and, before the observation time point, the personal credit information of the customer and spouse shows no "personal housing provident fund loan" or "personal housing loan" and the customer has no approved commercial housing loan or provident fund loan.
Sample sets are created separately for the three customer groups. To reduce the seasonal effects of the housing loan business, training and test samples are drawn by selecting the beginning of one month in each quarter as an observation time point, giving 4 observation time points in total. At each observation time point 30,000 sample users are drawn (25,000 users with Y = 0 and 5,000 users with Y = 1, drawn at random, i.e., a 1:5 ratio of Y = 1 to Y = 0), so that each customer group has 120,000 users in total as model training and test samples (split 7:3 between the training set and the test set). In addition, to verify the model effect, verification samples are created separately for the three customer groups: the beginning of one month in each quarter is selected as an observation time point, giving 4 observation time points in total, and 250,000 sample users are drawn at random at each observation time point, for 1,000,000 verification samples in total. For every user in the sample sets, the various types of raw in-bank customer data are acquired, including multi-dimensional user data such as basic customer information, debit card contract information, debit card account transactions, and AUM information.
Feature derivation for the samples specifically includes the following. When a feature has only a single record per sample, such as age or education level, no multi-value aggregation is needed and the value can be passed through directly as a model-input feature. For categorical variables, rare categories are usually merged first (for example student major, company type, or occupation) and then passed through. Features can also be statistically aggregated: for in-bank AUM, transaction information, issuance records, and the like, each user may have multiple transaction records that occur at different times. Data from the People's Bank of China (PBOC) credit records, such as loan details, residential address, occupation information, and housing provident fund information, can also be treated as flow data. For such flow features, different time windows are usually defined and statistical variables are calculated within each window: for numeric fields, the sum, mean, quantiles, minimum, maximum, standard deviation, coefficient of variation, and the like; for categorical variables, the number of occurrences of the main type, the number of distinct types that occur, and the like. Features can also be cross-derived: two derived features are combined to obtain a new feature, for example the number of months with month-over-month AUM growth divided by the number of months with AUM in the last 6 months.
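As a small illustration of the cross-derived feature mentioned at the end of the paragraph above (months with AUM growth divided by months in the last 6 months), the following Python sketch works on a list of month-end AUM values ordered from oldest to newest. The function name `aum_growth_ratio` and the choice to include the month before the window as the first comparison baseline are assumptions.

```python
def aum_growth_ratio(monthly_aum, window=6):
    """Share of the last `window` months whose AUM rose versus the previous month.

    Assumes monthly_aum has at least window + 1 month-end values, oldest first,
    so that the first month in the window has a preceding month to compare with.
    """
    recent = monthly_aum[-(window + 1):]
    # Count months with a month-over-month increase inside the window.
    growth_months = sum(1 for prev, cur in zip(recent, recent[1:]) if cur > prev)
    return growth_months / window


aum_series = [10.0, 12.0, 11.0, 13.0, 15.0, 15.0, 18.0]
ratio = aum_growth_ratio(aum_series)
print(round(ratio, 4))
```

Here the two ingredients (the growth-month count and the month count) are themselves derived features, and their quotient is the new cross feature.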
Feature screening is then performed on the derived samples. Feature coverage and IV (information value) are calculated on the full sample (development training set + development test set), not split by customer group, to evaluate how effective each feature is for the prediction target. The PSI of each feature between the development training set and the development test set is calculated per customer group to evaluate the stability of the feature. IV and PSI are interpreted as follows:
IV:
Less than 0.02: almost no discriminating power
0.02-0.1: weak discriminating power
0.1-0.3: medium discriminating power
Greater than 0.3: strong discriminating power
PSI:
Less than 0.1: no significant change in distribution
0.1-0.25: minor change in distribution
Greater than 0.25: major change in distribution
The coverage and IV of the derived features vary widely. For a feature with particularly low coverage, even a high IV may be accidental, and the feature's generalization ability is limited. A feature with a high PSI is distributed very differently on the training set and the test set, which shows that it is not stable enough to support a stable model that generalizes well. An IV that is too low means the feature is only weakly related to the prediction target, and including it in the model only adds complexity and noise. The features therefore need to be preliminarily screened on these indicators before modeling.
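The preliminary screening described above can be sketched as an IV calculation followed by a threshold check. The IV formula used here is the common definition (per bin, the difference between the positive and negative shares times the log of their ratio); the patent does not state its formula, and the function names and the minimum-coverage default are assumptions. The IV and PSI defaults follow the interpretation bands listed above.

```python
import math


def information_value(pos_counts, neg_counts, eps=1e-6):
    """IV over feature bins, given counts of Y = 1 and Y = 0 samples per bin."""
    pos_total, neg_total = sum(pos_counts), sum(neg_counts)
    iv = 0.0
    for pos_n, neg_n in zip(pos_counts, neg_counts):
        # Floor the shares so an empty bin does not cause log(0) or division by zero.
        pos_share = max(pos_n / pos_total, eps)
        neg_share = max(neg_n / neg_total, eps)
        iv += (pos_share - neg_share) * math.log(pos_share / neg_share)
    return iv


def keep_feature(coverage, iv, psi, min_coverage=0.05, min_iv=0.02, max_psi=0.25):
    """Keep a feature only if it is covered well enough, predictive, and stable."""
    return coverage >= min_coverage and iv >= min_iv and psi <= max_psi


# Three bins of a hypothetical feature: positives concentrate in the last bin.
iv_example = information_value([10, 30, 60], [50, 30, 20])
print(round(iv_example, 4))
print(keep_feature(coverage=0.8, iv=iv_example, psi=0.03))
print(keep_feature(coverage=0.01, iv=iv_example, psi=0.03))
```

The second `keep_feature` call shows a feature rejected for low coverage despite a high IV, which is the case the text warns about.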
It is worth noting that the method can also perform model evaluation, parameter tuning, and model verification on the constructed model. Specifically, an evaluation engine of the identification model is called, and the identification model corresponding to each customer group is processed according to a preset evaluation model and a preset stability verification model. When the monitored processing result does not meet the preset target condition, a model training program is triggered to adjust the parameters of the identification model. Preferably, the preset evaluation model mainly uses Gini and KS, both of which measure the discriminating power of the model; a model with poor discrimination can be optimized by parameter tuning. Note that only training data may be used for parameter tuning; test data must not be used. In addition, the stability verification model is measured by the PSI metric: the test set and the training set are divided into 1 to 20 groups by score, using either equal-width score intervals or equal-frequency bins (i.e., each group holds the same share of samples). The share of each training group in the total training samples and the share of each test group in the total test samples are calculated, and the per-group terms are summed: $\mathrm{PSI} = \sum_i (p_i^{\text{train}} - p_i^{\text{test}}) \ln\left(p_i^{\text{train}} / p_i^{\text{test}}\right)$, where $p_i^{\text{train}}$ and $p_i^{\text{test}}$ are the shares of group $i$ in the training and test samples.
Fig. 5 is a schematic diagram of the main modules of a user identification device according to an embodiment of the present invention. As shown in Fig. 5, the user identification device includes an obtaining module 501 and a processing module 502. The obtaining module 501 obtains user data from a data warehouse by calling a preset data interface. The processing module 502 determines, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the recognition model corresponding to that customer group. It then calls a recognition model engine, selects a data acquisition time point to acquire historical data of the user, calculates a scoring result for the user from the historical data, and pushes the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
In some embodiments, after obtaining the user data from the data warehouse by calling the preset data interface, the obtaining module 501 is further configured to:
call a preset screening component and filter the acquired user data to generate the user data to be identified.
In some embodiments, the processing module 502 determining the customer group according to the preset customer group classification component includes:
sending a data request to a credit investigation database server according to the user data, and judging whether the received processing result has a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
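The three-way decision above can be sketched as follows. This is only an illustration: `credit_lookup` is a hypothetical callable standing in for the request to the credit investigation database server, the key `first_target_attribute` is an invented name for the first target attribute value, and the second target attribute table is modeled as a plain set of user identifiers.

```python
from enum import Enum

class CustomerGroup(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3

def classify_customer_group(user_id, credit_lookup, second_attribute_table):
    """Return the customer group for a user.

    credit_lookup: hypothetical callable that returns the processing result
        of the credit investigation request as a dict.
    second_attribute_table: hypothetical stored table, here a set of user ids.
    """
    result = credit_lookup(user_id)
    # Step 1: a first target attribute value in the result decides group one.
    if result.get("first_target_attribute") is not None:
        return CustomerGroup.FIRST
    # Step 2: otherwise check whether the user appears in the stored table.
    if user_id in second_attribute_table:
        return CustomerGroup.SECOND
    # Step 3: everyone else falls into the third group.
    return CustomerGroup.THIRD
```

The returned group can then be used as the key for selecting the matching recognition model.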
In some embodiments, the processing module 502 acquiring the historical data of the user includes:
acquiring corresponding configuration information according to the recognition model, and then calculating, through preset time windows, the corresponding statistical variables in each window for the different types of configuration information.
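Window-based statistical variables of this kind can be sketched as follows. This is a minimal illustration under assumptions: the record layout (a date and an amount), the window lengths of 30, 90, and 180 days, and the chosen statistics (count, sum, maximum) are all hypothetical and are not specified in the text.

```python
from datetime import date, timedelta

def window_statistics(records, as_of, windows_days=(30, 90, 180)):
    """Compute simple statistics per look-back window.

    records: list of (date, amount) tuples for one user (assumed layout).
    as_of: the selected data acquisition time point.
    """
    stats = {}
    for days in windows_days:
        start = as_of - timedelta(days=days)
        # keep records inside the half-open window (start, as_of]
        amounts = [amt for d, amt in records if start < d <= as_of]
        stats[days] = {
            "count": len(amounts),
            "sum": sum(amounts),
            "max": max(amounts) if amounts else 0.0,
        }
    return stats
```

Anchoring every window at the same acquisition time point keeps the variables comparable across users and prevents data after that point from leaking into the features.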
In some embodiments, the processing module 502 is further configured to:
perform machine learning on each customer group separately based on the LightGBM model, so as to train the corresponding recognition models and then construct the recognition model engine.
As another embodiment of the present invention, as shown in Fig. 6, the user identification apparatus includes an obtaining module 501, a processing module 502, and a monitoring module 503. The obtaining module 501 obtains user data from a data warehouse by calling a preset data interface. The processing module 502 determines, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the recognition model corresponding to that customer group. It then calls a recognition model engine, selects a data acquisition time point to acquire historical data of the user, calculates a scoring result for the user from the historical data, and pushes the user data whose scoring result meets a preset matching condition to a third-party marketing platform. The monitoring module 503 calls an evaluation engine of the recognition model and processes the recognition model corresponding to each customer group according to a preset evaluation model and a stability verification model; when the monitored processing result does not meet the preset target condition, it triggers a model training program to adjust the parameters of the recognition model.
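The monitoring behavior can be sketched as follows. This is an illustration only: the KS statistic is computed in its usual form, while the thresholds `min_ks` and `max_psi` and the `evaluate` and `retrain` callables are hypothetical stand-ins for the evaluation engine and the model training program.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of the two classes.

    Assumes binary labels (1 = positive) with both classes present.
    """
    pairs = sorted(zip(scores, labels))
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # consume all samples sharing the same score before measuring the gap
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / total_pos - cum_neg / total_neg))
        i = j
    return best

def needs_retraining(ks, psi, min_ks=0.2, max_psi=0.25):
    """Hypothetical target condition: thresholds are illustrative only."""
    return ks < min_ks or psi > max_psi

def monitor(models_by_group, evaluate, retrain):
    """Evaluate each customer group's model and trigger retraining on failure."""
    triggered = []
    for group, model in models_by_group.items():
        ks, psi = evaluate(group, model)  # hypothetical evaluation engine call
        if needs_retraining(ks, psi):
            retrain(group)  # hypothetical hook into the model training program
            triggered.append(group)
    return triggered
```

Checking discrimination and stability separately for each customer group means that one drifting group can be retrained without touching the models of the others.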
It should be noted that the specific implementation of the user identification apparatus corresponds to that of the user identification method of the present invention, so the repeated content is not described again.
Fig. 7 shows an exemplary system architecture 700 to which the user identification method or user identification apparatus of an embodiment of the invention may be applied.
As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. Network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may have installed thereon various communication client applications, such as a shopping-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like.
The server 705 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by users using the terminal devices 701, 702, 703. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information; by way of example only) to the terminal device.
It should be noted that the user identification method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the user identification apparatus is generally disposed in the server 705.
It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU)801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM803, various programs and data necessary for the operation of the computer system 800 are also stored. The CPU801, ROM802, and RAM803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read from it can be installed into the storage section 808 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module and a processing module. Wherein the names of the modules do not in some cases constitute a limitation of the module itself.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: obtain user data from a data warehouse by calling a preset data interface; determine, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the recognition model corresponding to that customer group; and call a recognition model engine, select a data acquisition time point to acquire historical data of the user, calculate a scoring result for the user from the historical data, and push the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
According to the technical solution of the embodiment of the present invention, the problems of existing housing loan product recommendation, namely low recommendation precision, poor user experience, and heavy consumption of manpower and material resources, can be solved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. A method for identifying a user, comprising:
acquiring user data from a data warehouse by calling a preset data interface;
determining, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the recognition model corresponding to that customer group;
and calling a recognition model engine, selecting a data acquisition time point to acquire historical data of the user, calculating a scoring result for the user from the historical data, and pushing the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
2. The method of claim 1, wherein after acquiring the user data from the data warehouse by calling the preset data interface, the method comprises:
calling a preset screening component, and filtering the acquired user data to generate the user data to be identified.
3. The method of claim 1, wherein determining the customer group according to the preset customer group classification component comprises:
sending a data request to a credit investigation database server according to the user data, and judging whether the received processing result has a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
4. The method of claim 1, wherein acquiring the historical data of the user comprises:
acquiring corresponding configuration information according to the recognition model, and then calculating, through preset time windows, the corresponding statistical variables in each window for the different types of configuration information.
5. The method of claim 1, further comprising:
calling an evaluation engine of the recognition model, and processing the recognition model corresponding to each customer group according to a preset evaluation model and a stability verification model;
and triggering a model training program when the monitored processing result does not meet the preset target condition, so as to adjust the parameters of the recognition model.
6. The method of any of claims 1-5, further comprising:
performing machine learning on each customer group separately based on the LightGBM model, so as to train the corresponding recognition models and then construct the recognition model engine.
7. A user identification device, comprising:
an obtaining module, configured to obtain user data from a data warehouse by calling a preset data interface;
a processing module, configured to determine, based on the user data, the customer group to which the user belongs according to a preset customer group classification component, so as to match the recognition model corresponding to that customer group; and to call a recognition model engine, select a data acquisition time point to acquire historical data of the user, calculate a scoring result for the user from the historical data, and push the user data whose scoring result meets a preset matching condition to a third-party marketing platform.
8. The apparatus of claim 7, wherein the obtaining module, after obtaining the user data from the data warehouse by calling the preset data interface, is further configured for:
calling a preset screening component, and filtering the acquired user data to generate the user data to be identified.
9. The apparatus of claim 7, wherein the processing module determining the customer group according to the preset customer group classification component comprises:
sending a data request to a credit investigation database server according to the user data, and judging whether the received processing result has a first target attribute value; if so, determining that the user belongs to a first customer group; if not, calling a stored second target attribute table and judging whether the user data exists in it; if so, determining that the user belongs to a second customer group, and if not, determining that the user belongs to a third customer group.
10. The apparatus of claim 7, wherein the processing module acquiring the historical data of the user comprises:
acquiring corresponding configuration information according to the recognition model, and then calculating, through preset time windows, the corresponding statistical variables in each window for the different types of configuration information.
11. The apparatus of claim 7, further comprising:
a monitoring module, configured to call an evaluation engine of the recognition model and process the recognition model corresponding to each customer group according to a preset evaluation model and a stability verification model; and to trigger a model training program when the monitored processing result does not meet the preset target condition, so as to adjust the parameters of the recognition model.
12. The apparatus of any of claims 7-11, wherein the processing module is further configured for:
performing machine learning on each customer group separately based on the LightGBM model, so as to train the corresponding recognition models and then construct the recognition model engine.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202110338740.6A 2021-03-30 2021-03-30 User identification method and device Active CN112950359B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110338740.6A CN112950359B (en) 2021-03-30 2021-03-30 User identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110338740.6A CN112950359B (en) 2021-03-30 2021-03-30 User identification method and device

Publications (2)

Publication Number Publication Date
CN112950359A true CN112950359A (en) 2021-06-11
CN112950359B CN112950359B (en) 2022-06-28

Family

ID=76227473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110338740.6A Active CN112950359B (en) 2021-03-30 2021-03-30 User identification method and device

Country Status (1)

Country Link
CN (1) CN112950359B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468265A (en) * 2023-03-23 2023-07-21 杭州瓴羊智能服务有限公司 Batch user data processing method and device
WO2023236588A1 (en) * 2022-06-06 2023-12-14 上海淇玥信息技术有限公司 User classification method and apparatus based on deviation smoothing optimization for customer groups

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106600369A (en) * 2016-12-09 2017-04-26 广东奡风科技股份有限公司 Real-time recommendation system and method of financial products of banks based on Naive Bayesian classification
CN108765094A (en) * 2018-06-06 2018-11-06 中国平安人寿保险股份有限公司 Insurance data processing method, device, computer equipment and storage medium
CN109598535A (en) * 2018-11-05 2019-04-09 宁波大红鹰学院 It is a kind of based on big data to the method and system of distributed photovoltaic client segmentation
CN109670848A (en) * 2018-09-11 2019-04-23 深圳平安财富宝投资咨询有限公司 Customer segmentation method, user equipment, storage medium and device based on big data
CN111563628A (en) * 2020-05-09 2020-08-21 重庆锐云科技有限公司 Real estate customer transaction time prediction method, device and storage medium
CN111612519A (en) * 2020-04-13 2020-09-01 广发证券股份有限公司 Method, device and storage medium for identifying potential customers of financial product
CN111626766A (en) * 2020-04-23 2020-09-04 深圳索信达数据技术有限公司 Mobile banking marketing customer screening method integrating multiple machine learning models
CN112053237A (en) * 2020-09-30 2020-12-08 中国银行股份有限公司 Method, device and equipment for identifying business information of public customers by bank
CN112464094A (en) * 2020-11-30 2021-03-09 泰康保险集团股份有限公司 Information recommendation method and device, electronic equipment and storage medium
CN112541817A (en) * 2020-12-22 2021-03-23 建信金融科技有限责任公司 Marketing response processing method and system for potential customers of personal consumption loan

Also Published As

Publication number Publication date
CN112950359B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
CN112950359B (en) User identification method and device
CN114186626A (en) Abnormity detection method and device, electronic equipment and computer readable medium
CN114078050A (en) Loan overdue prediction method and device, electronic equipment and computer readable medium
CN111080178A (en) Risk monitoring method and device
CN113722433A (en) Information pushing method and device, electronic equipment and computer readable medium
CN111598360A (en) Service policy determination method and device and electronic equipment
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
CN110866698A (en) Device for assessing service score of service provider
CN112561685B (en) Customer classification method and device
CN111245815B (en) Data processing method and device, storage medium and electronic equipment
CN110197316B (en) Method and device for processing operation data, computer readable medium and electronic equipment
CN112990311A (en) Method and device for identifying admitted client
CN114092230A (en) Data processing method and device, electronic equipment and computer readable medium
CN114493851A (en) Risk processing method and device
CN112598499A (en) Method and device for determining credit limit
CN112734352A (en) Document auditing method and device based on data dimensionality
CN114880369A (en) Risk credit granting method and system based on weak data technology
CN111429257A (en) Transaction monitoring method and device
CN110895564A (en) Potential customer data processing method and device
CN112712270B (en) Information processing method, device, equipment and storage medium
CN111652501B (en) Financial product evaluation device and method, electronic equipment and storage medium
EP4138021A1 (en) Method of scoring and valuing data for exchange
TWI657393B (en) Marketing customer group prediction system and method
CN117974297A (en) Service processing method, device, electronic equipment and computer readable medium
CN113034270A (en) Credit risk early warning method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220926

Address after: 25 Financial Street, Xicheng District, Beijing 100033

Patentee after: CHINA CONSTRUCTION BANK Corp.

Address before: 12 / F, 15 / F, No. 99, Yincheng Road, Shanghai pilot Free Trade Zone, 200120

Patentee before: Jianxin Financial Science and Technology Co.,Ltd.
