CN114117247A - Information recommendation method and device, electronic equipment and storage medium - Google Patents

Information recommendation method and device, electronic equipment and storage medium

Info

Publication number
CN114117247A
Authority
CN
China
Prior art keywords
data set
object data
target
resource management
objects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111256727.2A
Other languages
Chinese (zh)
Inventor
李鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dazhu Hangzhou Technology Co ltd
Original Assignee
Dazhu Hangzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dazhu Hangzhou Technology Co ltd filed Critical Dazhu Hangzhou Technology Co ltd
Priority to CN202111256727.2A priority Critical patent/CN114117247A/en
Publication of CN114117247A publication Critical patent/CN114117247A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • G06F18/2113Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0255Targeted advertisements based on user history
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0269Targeted advertisements based on user profile or attribute
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an information recommendation method and device, electronic equipment and a storage medium. The method comprises the following steps: selecting, from a target database, a first object data set containing a plurality of first objects associated with a target resource management event and at least one item of feature data corresponding to each first object, and selecting, from the data other than the first object data set, a second object data set that is similar to the first object data set and is not associated with the target resource management event; training a target classification model using the first object data set, the second object data set and the common features of the first object data set and the second object data set to obtain a similar population expansion model; predicting, for each object in the target database that is neither a first object nor a second object and that has the common features, a probability value that the object is associated with the target resource management event; and recommending resource information to target objects whose probability value is greater than or equal to a first preset value. The method and device solve the technical problem in the related art that user needs cannot be accurately identified because recommended information matches customer needs poorly.

Description

Information recommendation method and device, electronic equipment and storage medium
Technical Field
The invention relates to the field of big data processing, in particular to an information recommendation method and device, electronic equipment and a storage medium.
Background
With social development, rising national living standards and residents' growing awareness of financial management, more and more people are willing to consider choosing the most suitable financial product. However, as large commercial institutions and internet enterprises continue to enter the financial industry, the variety of financial products has become overwhelming, bringing unprecedented competition to traditional financial institutions such as banks.
At present, some financial institutions such as banks still rely on traditional ways of promoting financial products, and the form of recommendation is limited. For example, most recommendations are made face to face, that is, sales staff recommend financial products that the customer may purchase based on their own knowledge of and experience with the customer. Alternatively, newly released products are promoted indiscriminately to all customers without reliable supporting data. In both cases the customers' needs for financial products cannot be accurately grasped, the recommendation success rate is low, and customers are lost. Moreover, some current intelligent recommendation algorithms cannot mine the commonalities of customer portraits well when little data is available, so the model training effect is poor.
Disclosure of Invention
In view of the above problems, the present invention provides an information recommendation method and apparatus, an electronic device, and a storage medium, so as to at least solve the technical problem in the related art that the user needs cannot be accurately identified due to low matching between information recommendation and the customer needs.
According to a first aspect of the present invention, there is provided an information recommendation method, including: selecting, from a target database, a first object data set containing a plurality of first objects associated with a target resource management event and at least one item of feature data corresponding to each first object, and selecting, from the data other than the first object data set, a second object data set that is similar to the first object data set, wherein the second objects in the second object data set are not associated with the target resource management event; screening out, from the first object data set by using a random forest model, common features between the first object and other first objects; training a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model; predicting, by the similar population expansion model, a probability value that an object in the target database, other than the first objects and the second objects, that has the common features is associated with the target resource management event; and recommending resource information associated with the target resource management event to target objects whose probability value is greater than or equal to a first preset value.
Optionally, the feature data at least includes a basic feature, a financial behavior feature and a portrait description feature.
Optionally, before the screening out the common features between the first object and other first objects from the first object data set by using a random forest model, the method further includes: supplementing at least one feature data lacking a feature value in the first object data set with corresponding feature data; carrying out one-hot coding on discrete characteristic values corresponding to at least one characteristic data in the first object data set; and normalizing the characteristic value corresponding to at least one characteristic data in the first object data set.
Optionally, the training of the target classification model by using the first object data set, the second object data set, and the common features to obtain a similar population expansion model includes: identifying a seed object dataset and a non-seed object dataset in the first object dataset using the common features; and training the target classification model by taking the seed object data set and the non-seed object data set as positive samples and the second object data set as negative samples.
Optionally, the identifying the seed object data set and the non-seed object data set in the first object data set by using the common feature includes: clustering the first object using the commonality features and a clustering model to divide the first object data set into a seed object data set and a non-seed object data set; wherein a probability value of the first object in the seed object data set being associated with the target resource management event is greater than or equal to a second preset value, and a probability value of the first object in the non-seed object data set being associated with the target resource management event is less than the second preset value.
Optionally, after recommending resource information associated with the target resource management event to the target object with the probability value greater than or equal to the first preset value, the method further includes: selecting a third object associated with the target resource management event from the target objects; and if the number of the third objects is smaller than a third preset value, adding a third object data set corresponding to the third objects into the first object data set, and re-screening out common characteristics between the first objects and other first objects from the updated first object data set by using the random forest model.
Optionally, the target classification model is a support vector machine model.
According to a second aspect of the present invention, there is provided an information recommendation apparatus comprising: a first selecting module, configured to select, from a target database, a first object data set including a plurality of first objects associated with a target resource management event and at least one item of feature data corresponding to each of the first objects, and to select, from the data other than the first object data set, a second object data set that is similar to the first object data set and is not associated with the target resource management event; a first screening module, configured to screen out, from the first object data set by using a random forest model, common features between the first object and other first objects; a training module, configured to train a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model; a prediction module, configured to predict, through the similar population expansion model, a probability value that an object in the target database, other than the first objects and the second objects, that has the common features is associated with the target resource management event; and a recommending module, configured to recommend the resource information associated with the target resource management event to target objects whose probability value is greater than or equal to a first preset value.
Optionally, the feature data at least includes a basic feature, a financial behavior feature and a portrait description feature.
Optionally, before the first filtering module filters out the common features between the first object and other first objects from the first object data set by using a random forest model, the apparatus further includes: a supplementing module, configured to supplement at least one feature data lacking a feature value in the first object data set with corresponding feature data; the encoding module is used for carrying out one-hot encoding on discrete characteristic values corresponding to at least one characteristic data in the first object data set; and the normalization module is used for normalizing the characteristic value corresponding to at least one characteristic data in the first object data set.
Optionally, the training module includes: an identifying unit for identifying a seed object data set and a non-seed object data set in the first object data set using the common features; and the training unit is used for training the target classification model by taking the seed object data set and the non-seed object data set as positive samples and the second object data set as negative samples.
Optionally, the identification unit includes: a clustering subunit, configured to cluster the first object by using the common feature and a clustering model to divide the first object data set into a seed object data set and a non-seed object data set; wherein a probability value of the first object in the seed object data set being associated with the target resource management event is greater than or equal to a second preset value, and a probability value of the first object in the non-seed object data set being associated with the target resource management event is less than the second preset value.
Optionally, the apparatus further comprises: the second selection module is used for selecting a third object related to the target resource management event from the target objects after the recommendation module recommends the resource information related to the target resource management event to the target objects with the probability value larger than or equal to a first preset value; and the second screening module is used for adding a third object data set corresponding to the third object to the first object data set if the number of the third objects is smaller than a third preset value, and re-screening the common characteristics between the first object and other first objects from the updated first object data set by using the random forest model.
Optionally, the target classification model is a support vector machine model.
According to a third aspect of the present invention, there is also provided an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps in any of the above method embodiments.
According to a fourth aspect of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps in any of the method embodiments described above when executed.
The information recommendation method provided by the embodiment of the invention selects, from a target database, a first object data set containing a plurality of first objects associated with a target resource management event and at least one item of feature data corresponding to each first object, and selects, from the data other than the first object data set, a second object data set that is similar to the first object data set, wherein the second objects in the second object data set are not associated with the target resource management event. Common features between the first object and other first objects are screened out from the first object data set by using a random forest model, which mines the commonalities among user features more effectively. A target classification model is trained by using the first object data set, the second object data set and the common features to obtain a similar population expansion model. The similar population expansion model predicts the probability value that an object in the target database, other than the first objects and the second objects, that has the common features is associated with the target resource management event. Resource information associated with the target resource management event is recommended to target objects whose probability value is greater than or equal to a first preset value. By using the similar population expansion model, more potential customers who share the commonalities are found even when little data is available, and personalized products are accurately pushed to these potential customers to meet user needs. This solves the technical problem in the related art that user needs cannot be accurately identified because recommended information matches customer needs poorly.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly described below.
Fig. 1 is a block diagram of the hardware structure of a computer terminal to which an information recommendation method according to an embodiment of the present invention is applied;
Fig. 2 is a flowchart of an information recommendation method according to an embodiment of the present invention;
Fig. 3 is a flowchart illustrating the operation of information recommendation provided in accordance with an embodiment of the present invention;
Fig. 4 is a block diagram illustrating an information recommendation apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that such uses are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to".
In order to solve the technical problems in the related art, an information recommendation method is provided in the present embodiment. The following describes the technical solution of the present invention and how to solve the above technical problems with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
The method provided by the embodiment of the invention can be executed in a mobile terminal, a server, a computer terminal or a similar computing device. Taking execution on a computer terminal as an example, Fig. 1 is a block diagram of the hardware structure of a computer terminal to which an information recommendation method according to an embodiment of the present invention is applied. As shown in Fig. 1, the computer terminal may include one or more (only one is shown in Fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)) and a memory 104 for storing data, and optionally, a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in Fig. 1 is only an illustration and is not intended to limit the structure of the computer terminal. For example, the computer terminal may also include more or fewer components than shown in Fig. 1, or have a different configuration than shown in Fig. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the information recommendation method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory, and may also include volatile memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the internet wirelessly.
Fig. 2 is a flowchart of an information recommendation method according to an embodiment of the present invention, and as shown in fig. 2, the flowchart includes the following steps:
Step S202, selecting, from a target database, a first object data set containing a plurality of first objects associated with a target resource management event and at least one item of feature data corresponding to each first object, and selecting, from the data other than the first object data set, a second object data set that is similar to the first object data set, wherein the second objects in the second object data set are not associated with the target resource management event;
Preferably, the feature data includes at least basic features, financial behavior features and portrait description features. The basic features are the customer's basic information, such as name, home address level, age, sex, education level and occupation type. The financial behavior features describe the customer's financial behavior, such as the last 10 browsing actions, the last 3 purchases of financial products, the timestamp of the most recent large fund transfer-in and the amount of the most recent large fund transfer-in (more than 50,000 yuan). The portrait description features are profile descriptions of the customer, such as financial status, risk preference, risk tolerance, investment experience and investment style.
In one application scenario, being associated with the target resource management event means that the customer may purchase a particular financial product; not being associated with the target resource management event means that, by default, the customer does not purchase that financial product.
Step S204, screening out common characteristics between the first object and other first objects from the first object data set by using a random forest model;
In an optional embodiment of the present disclosure, before step S204, the method further includes: filling in the feature values that are missing for at least one item of feature data in the first object data set; one-hot encoding the discrete feature values of at least one item of feature data in the first object data set; and normalizing the feature values of at least one item of feature data in the first object data set.
In a preferred example, missing discrete values in the first object data set are filled with the mode, and missing continuous values in the first object data set are filled with the median.
For example, in the first object data set, the education-level column among the basic features takes values such as junior high school, high school, junior college, bachelor's degree, master's degree and doctorate, but for some customers this column is empty. The occurrences of each education level are counted: if there are, say, 10 junior high school, 20 high school, 60 junior college, 50 bachelor's, 20 master's and 10 doctorate entries, the mode of the education-level feature is junior college, so customers whose education level is missing are filled with junior college. Similarly, for the income feature of the first object data set, the median income over all customers is computed, and customers whose income feature is empty (i.e. a missing value) are filled with that median.
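The filling rule described above (mode for discrete features, median for continuous features) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `impute`, the dictionary-based record layout and the sample values are assumptions made for the example and do not come from the source.

```python
from statistics import median, mode


def impute(rows, discrete_cols, continuous_cols):
    """Return copies of the rows with None replaced:
    mode for discrete columns, median for continuous columns."""
    fill = {}
    for col in discrete_cols:
        # Most frequent observed value of the column.
        fill[col] = mode(r[col] for r in rows if r[col] is not None)
    for col in continuous_cols:
        # Middle observed value of the column.
        fill[col] = median(r[col] for r in rows if r[col] is not None)
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]


# Hypothetical customer records; None marks a missing value.
customers = [
    {"education": "bachelor", "income": 10000},
    {"education": "bachelor", "income": None},
    {"education": "high school", "income": 20000},
    {"education": None, "income": 15000},
]
filled = impute(customers, ["education"], ["income"])
print(filled[3]["education"], filled[1]["income"])
```

The fill values are computed once over the whole data set, so every customer with a missing value in a column receives the same replacement.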
In a preferred example, the discrete tags (i.e., the discrete features described above) are one-hot encoded.
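As a small illustration of one-hot encoding a discrete feature, the sketch below maps each category to a 0/1 vector with exactly one position set. The helper name `one_hot` and the risk-preference values are hypothetical; the source does not specify the categories or their order, so alphabetical order is used here only to make the output deterministic.

```python
def one_hot(values):
    """Encode each discrete value as a 0/1 vector with one slot per category."""
    categories = sorted(set(values))          # fixed, reproducible column order
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        vec = [0] * len(categories)
        vec[index[v]] = 1                     # exactly one position is set
        encoded.append(vec)
    return categories, encoded


# Hypothetical risk-preference labels for four customers.
categories, encoded = one_hot(["conservative", "aggressive", "balanced", "aggressive"])
print(categories)
print(encoded)
```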
In a preferred example, the feature values are normalized using the formula:
$$X = \frac{x_i - x_{mean}}{x_{std}}$$
where $X$ is the normalized result; $x_i$ is the value of the current feature data for the first object; $x_{mean}$ is the mean of the current feature data over all seed customers; and $x_{std}$ is the standard deviation of that feature column over all seed customers. For example, given customer 1 with an income of 10000 yuan, customer 2 with an income of 15000 yuan and customer 3 with an income of 20000 yuan: for customer 1, $x_1$ is 10000 yuan, $x_{mean}$ (the mean of the three incomes) is 15000 yuan, and $x_{std}$ (the sample standard deviation of the three incomes) is 5000 yuan, giving $X = -1.0$; customers 2 and 3 are handled in the same way.
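The normalization can be sketched in a few lines of Python. The source does not say which standard-deviation convention is intended, and the resulting values depend on that choice; this sketch assumes the sample standard deviation (n - 1 denominator) as computed by `statistics.stdev`. The function name `standardize` is hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score each value: (x_i - mean) / standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(v - mu) / sigma for v in values]


incomes = [10000, 15000, 20000]   # incomes of three customers, in yuan
z = standardize(incomes)
print(z)
```

With the population standard deviation (`statistics.pstdev`) the same incomes would yield values of larger magnitude, so the convention should be fixed once and applied to every feature column.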
And performing data cleaning on the characteristic data through the steps to supplement or correct errors existing in the data to be processed and ensure data consistency.
In this embodiment, the first object data set is input into a random forest model, wherein the first object data set includes basic features, financial behavior features and portrait description features of the first object. For example, basic information of the customer, including the customer's name/home address level/age/sex/school calendar/work type, etc.; the second part is the behavior information of the customer, including the last 10 browsing behaviors of the customer/the last 3 financial product purchasing behaviors/the last large fund transfer timestamp/the last large fund transfer amount (more than 5 w); the third part is the portrait description of the client, including the financial status/risk preference/risk tolerance/investment experience/investment style, etc., for example, feature data of 36 latitudes in total, then the present embodiment needs to select 6 features (i.e. the above-mentioned common features) from the feature data of 36 dimensions, which have the largest influence on the result.
Specifically, the random forest model in this embodiment is an algorithm that combines a plurality of trees following the idea of ensemble learning; its basic unit is a decision tree, and it belongs to the ensemble learning branch of machine learning. After the random forest algorithm is trained on the feature data, the trained model provides an interface through which the n features that had the largest influence on the result during training can be output. For example, with n = 6, the 6 common features finally obtained are financial status, risk preference, browsing behavior, the last 3 financial product purchases, the amount of the last large fund transfer, and home address level.
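A rough sketch of the "n most influential features" selection follows. It assumes the trained model already reports one importance score per feature; in scikit-learn, for instance, a fitted `RandomForestClassifier` exposes such scores as `feature_importances_`, but the source names no library, so that is only one possible interface. The scores and the function name `top_n_features` below are hypothetical.

```python
def top_n_features(importances, n):
    """Return the n feature names with the largest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical importance scores, as a trained random forest might report them.
importances = {
    "financial_status": 0.21,
    "risk_preference": 0.18,
    "age": 0.03,
    "browsing_behavior": 0.15,
    "work_type": 0.02,
}
common_features = top_n_features(importances, n=3)
```

With 36 input features and n = 6, the same call would yield the six common features described in the text.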
Step S206, training a target classification model by using the first object data set, the second object data set and the common characteristics to obtain a similar population expansion model;
In an alternative embodiment of the present disclosure, a seed object data set and a non-seed object data set in the first object data set are identified using the common features, and the target classification model is trained by taking the seed object data set and the non-seed object data set as positive samples and the second object data set as negative samples.
In this embodiment, some high-net-worth customers (i.e., the seed object data set) purchase financial products often, and their purchases and purchase frequency are related to their wealth. Other customers who are not high-net-worth (i.e., the non-seed object data set) save up a certain amount of money and then buy a particular financial product. Both groups are good customers with high stickiness and share certain commonalities, but they still differ in some features and therefore need to be considered separately.
Specifically, the first objects are clustered using the common features and a clustering model, so as to divide the first object data set into a seed object data set and a non-seed object data set; the probability value that a first object in the seed object data set is associated with the target resource management event is greater than or equal to a second preset value, and the probability value that a first object in the non-seed object data set is associated with the target resource management event is smaller than the second preset value.
For example, stickier customers (i.e., the first objects, who may buy a financial product) are selected from the target database, and a second object data set similar to the first object data set is then selected from the remaining objects. The number of objects in the second object data set is the same as the number of first objects; by default, it is assumed that no object in the second object data set will buy the financial product, and the objects in the second object data set share common features with the first objects.
For example, the data of 10,000 seed customers is extracted from the target database, and another 10,000 customers other than the seed customers are randomly extracted from the target database as the negative class of the training samples, while the seed customers serve as the positive class. Population expansion using the classification model proceeds as follows: the seed users are positive samples (they have purchased); the candidate objects are negative samples (randomly selected); the classification model is trained on these samples together with the features of the high-net-worth and non-high-net-worth users in the positive samples; and all candidate objects are then screened using the trained classification model.
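The sample construction described here can be sketched as follows, assuming customers are identified by IDs. The function name `build_training_set`, the ID values, and the fixed random seed are illustrative only.

```python
import random


def build_training_set(seed_ids, all_ids, rng=None):
    """Label seed customers 1 and an equally sized random sample of the rest 0."""
    rng = rng or random.Random()
    seed_set = set(seed_ids)
    remaining = [cid for cid in all_ids if cid not in seed_set]
    # Draw as many negatives as there are positives (or all that remain, if fewer).
    negatives = rng.sample(remaining, k=min(len(seed_ids), len(remaining)))
    return [(cid, 1) for cid in seed_ids] + [(cid, 0) for cid in negatives]


# Hypothetical customer IDs: 3 seed customers out of 10.
all_ids = list(range(10))
samples = build_training_set(seed_ids=[0, 1, 2], all_ids=all_ids, rng=random.Random(0))
```

Treating randomly drawn non-seed customers as negatives is a default assumption, as the text notes, since some of them may in fact be potential buyers.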
Preferably, the classification model is a support vector machine model.
After the first object data set is divided into a seed object data set and a non-seed object data set, each object is tagged with a label.
Step S208, predicting, through the similar population expansion model, the probability value that an object in the target database, other than the first objects and the second objects, that has the common features is associated with the target resource management event;
In this embodiment, the similar population expansion model learns the associations between objects, so that it learns from historical data to judge whether a customer will purchase the product.
Step S210, recommending resource information associated with the target resource management event to the target object with the probability value greater than or equal to the first preset value.
In one example, the algorithm model used is an SVM (Support Vector Machine). After the feature data is input into the model for training, the trained model (i.e., the similar population expansion model) scores the target population and outputs, for each object, the probability of purchasing a certain product. The objects are sorted by probability from high to low, the top N are taken as potential customers for the product (i.e., the target objects), and the associated resource information is pushed to them.
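The ranking and thresholding in steps S208 and S210 can be sketched as below, assuming the model's predicted probabilities are already available as a mapping from object ID to probability. The function name `select_targets`, the IDs, and the probability values are hypothetical.

```python
def select_targets(probabilities, threshold, top_n=None):
    """Keep objects whose probability is >= threshold, highest probability first."""
    eligible = [(obj, p) for obj, p in probabilities.items() if p >= threshold]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return eligible if top_n is None else eligible[:top_n]


# Hypothetical model outputs: customer ID -> predicted purchase probability.
predicted = {"c1": 0.92, "c2": 0.35, "c3": 0.71, "c4": 0.60}
targets = select_targets(predicted, threshold=0.6, top_n=2)
```

Here `threshold` plays the role of the first preset value and `top_n` the role of N; either can be used alone or both together.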
After the step S210, the method further includes: selecting a third object related to the target resource management event from the target objects; and if the number of the third objects is smaller than a third preset value, adding a third object data set corresponding to the third objects into the first object data set, and re-screening the common characteristics between the first objects and other first objects from the updated first object data set by using a random forest model.
In another embodiment, if the number of the third objects is greater than or equal to a third preset value, which indicates that the prediction effect of the similar population expansion model is good, the operation is ended, and the prediction result is output.
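The feedback rule after step S210 can be sketched as a single decision function. This is one reading of the text under stated assumptions: objects are plain IDs, `purchasers` is the set of target objects later observed to be associated with the event, and the function name `feedback_step` is illustrative.

```python
def feedback_step(first_objects, target_objects, purchasers, third_preset_value):
    """Decide whether to finish or to enlarge the first object set and re-screen.

    Returns (updated_first_objects, needs_rescreening).
    """
    third_objects = [obj for obj in target_objects if obj in purchasers]
    if len(third_objects) >= third_preset_value:
        # Prediction effect is considered good: keep the set and stop.
        return list(first_objects), False
    # Too few conversions: fold the new purchasers in and re-run feature screening.
    updated = list(first_objects) + [o for o in third_objects if o not in first_objects]
    return updated, True


updated, rescreen = feedback_step(["a", "b"], ["c", "d", "e"], {"d"}, third_preset_value=2)
```

When `rescreen` is true, the random forest screening would be repeated on `updated` before retraining.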
The information recommendation method provided by the embodiment of the invention selects, from a target database, a first object data set containing a plurality of first objects associated with the target resource management event and at least one feature data corresponding to each first object, and selects a second object data set, other than the first object data set, that is similar to the first object data set, wherein the second objects in the second object data set are not associated with the target resource management event. A random forest model is used to screen out the common features between each first object and the other first objects from the first object data set, which better mines the commonalities among user features. A target classification model is trained using the first object data set, the second object data set and the common features to obtain a similar population expansion model. The similar population expansion model predicts the probability value that an object in the target database, other than the first objects and the second objects, that has the common features is associated with the target resource management event. Resource information associated with the target resource management event is recommended to the target objects whose probability value is greater than or equal to a first preset value. Using the similar population expansion model, more potential customers with commonalities are found from a small amount of data, and personalized products are accurately pushed to them to meet user needs, thereby solving the technical problem in the related art that user needs cannot be accurately identified because recommended information matches customer needs poorly.
The following describes the present invention with reference to a workflow of information recommendation of a specific embodiment:
Fig. 3 is a flowchart of an information recommendation operation according to an embodiment of the present invention. As shown in Fig. 3, it includes the following steps:
Step S301, data source: the first object data set consists of users with high stickiness to the bank's financial products. Sticky users are customers who have purchased the present financial product and have also purchased other financial products. If a customer purchases a certain financial product, some of that customer's basic information and financial behavior information is correlated with that of other customers who have purchased the product, and repeated purchases indicate enthusiasm for financial products. By mining the information of this batch of customers, another batch of customers whose key information (i.e., the common features) is similar, namely potential customers who may purchase the financial product, can be found and targeted for promotion.
Step S302, cleaning a data source;
Step S303, selecting important features: the cleaned data features are put into a random forest model for training; if the original data has k features, a reduced number of them (the count is given by a formula figure that is not reproduced here; in the example above, 6 of 36 features) is finally selected as output;
Step S304, K-means clustering: the output of step S303 is clustered using a clustering algorithm to distinguish the seed users, and each user is tagged with a label. The number of categories can be chosen according to the actual situation; in this embodiment it is 2, namely the seed object data set and the non-seed object data set.
Step S305, algorithm model: the seed object data set and the non-seed object data set are taken as positive examples (i.e., positive samples) of the model, a similar number of users are randomly extracted from the target database as negative examples (i.e., negative samples), and both are put into a support vector machine model for training to obtain the similar population expansion model;
Step S306, outputting a result: the similar population expansion model scores all users in the database and outputs, for each user, the probability of being a positive example.
Step S307, effect judgment; sorting according to the probability, and selecting the first m users as popularization objects (namely the target objects); if the effect is significant, step S308 is executed to stabilize the output.
If the effect is not good, step S309 is executed: whether users purchase the financial product is used as the metric for feeding back the model's effectiveness, and the flow returns to step S303 to adjust the feature selection and the clustering model parameters. For example, a financial product is promoted to a target customer; if the customer purchases the product, the prediction was accurate, and if the customer does not, the prediction still needs improvement. Further, users who purchased the product are added to the positive samples, users who did not purchase are added to the negative samples, and training is performed again. Some training parameters are adjusted according to the actual feedback (to address over-fitting or under-fitting), so that the prediction model mines the associations as fully as possible and the prediction accuracy improves.
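The K-means clustering of step S304 can be sketched with a small pure-Python implementation. Two choices here are assumptions rather than anything the source specifies: centroids are initialized deterministically by a farthest-first rule, and the sample points stand in for standardized common features. Which resulting cluster is treated as the seed set is left to the caller.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=2, iterations=50):
    """Lloyd's k-means; returns one cluster index (0..k-1) per point."""
    # Assumption: farthest-first initialization, chosen to keep the sketch deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical standardized feature pairs for six first objects.
points = [(0.9, 1.0), (1.1, 0.8), (1.0, 1.2), (-1.0, -0.9), (-0.8, -1.1), (-1.2, -1.0)]
labels = kmeans(points, k=2)
```

With k = 2 the two index values correspond to the seed and non-seed object data sets of this embodiment.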
Based on the information recommendation methods provided in the foregoing embodiments and on the same inventive concept, the present embodiment further provides an information recommendation apparatus, which is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of an information recommendation apparatus according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes: a first selecting module 40, configured to select, from the target database, a first object data set including a plurality of first objects associated with the target resource management event and at least one feature data corresponding to each of the first objects, and to select a second object data set, other than the first object data set, that is similar to the first object data set and is not associated with the target resource management event; a first screening module 42, connected to the first selecting module 40, for screening out common features between the first object and other first objects from the first object data set by using a random forest model; a training module 44, connected to the first screening module 42, for training a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model; a prediction module 46, connected to the training module 44, for predicting, through the similar population expansion model, the probability value that an object in the target database, other than the first object and the second object, that has the common features is associated with the target resource management event; and a recommending module 48, connected to the prediction module 46, for recommending the resource information associated with the target resource management event to the target object whose probability value is greater than or equal to the first preset value.
Optionally, the characteristic data at least comprises a basic characteristic, a financial behavior characteristic and a portrait description characteristic.
Optionally, before the first screening module 42 screens out the common features between the first object and the other first objects from the first object data set by using the random forest model, the apparatus further includes: a supplementing module, used for supplementing at least one feature data lacking a feature value in the first object data set with the corresponding feature data; an encoding module, used for carrying out one-hot encoding on discrete feature values corresponding to at least one feature data in the first object data set; and a normalization module, used for normalizing the feature value corresponding to at least one feature data in the first object data set.
Optionally, the training module 44 includes: an identification unit for identifying a seed object data set and a non-seed object data set in the first object data set using the common features; and the training unit is used for training the target classification model by taking the seed object data set and the non-seed object data set as positive samples and the second object data set as negative samples.
Optionally, the identification unit includes: a clustering subunit, configured to cluster the first object by using the commonality characteristics and the clustering model, so as to divide the first object data set into a seed object data set and a non-seed object data set; the probability value of the first object associated target resource management event in the seed object data set is greater than or equal to a second preset value, and the probability value of the first object associated target resource management event in the non-seed object data set is smaller than the second preset value.
Optionally, the apparatus further comprises: the second selecting module is used for selecting a third object related to the target resource management event from the target objects after the recommending module 48 recommends the resource information related to the target resource management event to the target objects with the probability values larger than or equal to the first preset value; and the second screening module is used for adding a third object data set corresponding to the third object to the first object data set if the number of the third objects is smaller than a third preset value, and re-screening the common characteristics between the first object and other first objects from the updated first object data set by using the random forest model.
Optionally, the target classification model is a support vector machine model.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
S1, selecting a first object data set containing a plurality of first objects associated with a target resource management event and at least one feature data corresponding to each first object from a target database, and selecting a second object data set, other than the first object data set, that is similar to the first object data set, wherein the second objects in the second object data set are not associated with the target resource management event;
S2, screening out common features between the first object and other first objects from the first object data set by using a random forest model;
S3, training a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model;
S4, predicting, through the similar population expansion model, the probability value that an object in the target database, other than the first object and the second object, that has the common features is associated with the target resource management event;
S5, recommending resource information associated with the target resource management event to the target object with the probability value greater than or equal to the first preset value.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Based on the above embodiments of the method shown in fig. 2 and the apparatus shown in fig. 4, in order to achieve the above object, an embodiment of the present invention further provides an electronic device, as shown in fig. 5, including a memory 52 and a processor 51, where the memory 52 and the processor 51 are both disposed on a bus 53, the memory 52 stores a computer program, and the processor 51 implements the information recommendation method shown in fig. 2 when executing the computer program.
Based on such understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a memory (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling an electronic device (which can be a personal computer, a server, or a network device, etc.) to execute the method according to the implementation scenarios of the present invention.
Optionally, the device may also be connected to an object interface, a network interface, a camera, Radio Frequency (RF) circuitry, a sensor, audio circuitry, a WI-FI module, and so forth. The object interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc., and may optionally also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Bluetooth interface, a WI-FI interface), etc.
It will be understood by those skilled in the art that the structure of an electronic device provided in the present embodiment does not constitute a limitation of the physical device, and may include more or less components, or some components in combination, or a different arrangement of components.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An information recommendation method, comprising:
selecting a first object data set containing a plurality of first objects associated with a target resource management event and at least one characteristic data corresponding to each first object from a target database, and selecting a second object data set which is similar to the first object data set except the first object data set, wherein a second object in the second object data set is not associated with the target resource management event;
screening out common characteristics between the first object and other first objects from the first object data set by using a random forest model;
training a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model;
predicting, by the similar population expansion model, a probability value that an object in the target database, other than the first object and the second object, having the common characteristic is associated with the target resource management event;
and recommending resource information associated with the target resource management event to the target object with the probability value greater than or equal to a first preset value.
2. The method of claim 1, wherein the feature data includes at least a basic feature, a financial behavior feature, and a portrait description feature.
3. The method of claim 1, wherein, prior to screening out common features between the first object and other first objects from the first object data set using a random forest model, the method further comprises:
supplementing at least one feature data lacking a feature value in the first object data set with corresponding feature data;
carrying out one-hot coding on discrete characteristic values corresponding to at least one characteristic data in the first object data set;
and normalizing the characteristic value corresponding to at least one characteristic data in the first object data set.
4. The method of claim 1, wherein training a target classification model using the first object data set, the second object data set, and the commonality features, resulting in a similar population extension model comprises:
identifying a seed object dataset and a non-seed object dataset in the first object dataset using the common features;
and training the target classification model by taking the seed object data set and the non-seed object data set as positive samples and the second object data set as negative samples.
5. The method of claim 4, wherein the identifying a seed object dataset and a non-seed object dataset in the first object dataset using the commonality features comprises:
clustering the first object using the commonality features and a clustering model to divide the first object data set into a seed object data set and a non-seed object data set;
wherein a probability value of the first object in the seed object data set being associated with the target resource management event is greater than or equal to a second preset value, and a probability value of the first object in the non-seed object data set being associated with the target resource management event is less than the second preset value.
6. The method of claim 1, wherein after recommending resource information associated with the target resource management event to the target object having the probability value greater than or equal to a first preset value, the method further comprises:
selecting a third object associated with the target resource management event from the target objects;
and if the number of the third objects is smaller than a third preset value, adding a third object data set corresponding to the third objects into the first object data set, and re-screening out common characteristics between the first objects and other first objects from the updated first object data set by using the random forest model.
7. The method according to any one of claims 1-6, wherein the target classification model is a support vector machine model.
8. An information recommendation apparatus, comprising:
a first selecting module, configured to select, from a target database, a first object data set including a plurality of first objects associated with a target resource management event and at least one feature data corresponding to each of the first objects, and to select a second object data set, other than the first object data set, that is similar to the first object data set and is not associated with the target resource management event;
a first screening module, configured to screen out common features between the first object and other first objects from the first object data set by using a random forest model;
a training module, configured to train a target classification model by using the first object data set, the second object data set and the common features to obtain a similar population expansion model;
a prediction module, configured to predict, through the similar population expansion model, a probability value of an object, other than the first object and the second object, in the target database, and having the common characteristic, being associated with the target resource management event;
and a recommending module, configured to recommend the resource information associated with the target resource management event to the target object with the probability value greater than or equal to a first preset value.
9. An electronic device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A storage medium having a computer program stored thereon, the computer program, when being executed by a processor, realizing the steps of the method of any one of claims 1 to 7.
CN202111256727.2A 2021-10-27 2021-10-27 Information recommendation method and device, electronic equipment and storage medium Pending CN114117247A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111256727.2A CN114117247A (en) 2021-10-27 2021-10-27 Information recommendation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114117247A true CN114117247A (en) 2022-03-01

Family

ID=80377199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111256727.2A Pending CN114117247A (en) 2021-10-27 2021-10-27 Information recommendation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114117247A (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination