CN113244629B - Recall method and device for lost account, storage medium and electronic equipment - Google Patents

Recall method and device for lost account, storage medium and electronic equipment

Info

Publication number
CN113244629B
Authority
CN
China
Prior art keywords
account
loss
recall
subset
lost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110693747.XA
Other languages
Chinese (zh)
Other versions
CN113244629A (en)
Inventor
陶冶
刘阳
徐广根
刘妍
万志远
叶沐芊
邹丰富
江鑫
李鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110693747.XA priority Critical patent/CN113244629B/en
Publication of CN113244629A publication Critical patent/CN113244629A/en
Application granted granted Critical
Publication of CN113244629B publication Critical patent/CN113244629B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 Game security or game management aspects
    • A63F13/79 Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a recall method and device for lost accounts, a storage medium and electronic equipment. The method comprises the following steps: acquiring a to-be-identified lost account set and a corresponding lost account feature set; performing a clustering operation on the lost account set over a plurality of predetermined clusters according to the lost account feature set to obtain a first lost account subset of the lost account set; inputting the lost account feature set into a target neural network model to obtain a second lost account subset of the lost account set; and determining a third lost account subset to be recalled according to the first lost account subset and the second lost account subset. The invention solves the technical problem of a low recall rate of lost accounts.

Description

Recall method and device for lost account, storage medium and electronic equipment
Technical Field
The present invention relates to the field of computers, and in particular, to a recall method and apparatus for a lost account, a storage medium, and an electronic device.
Background
In the related art, account loss occurs in application clients and applets, for example in game clients, short-video clients and business clients. Recalling lost accounts is important for ensuring the continued development of an application client or applet.
Most existing recall schemes apply a uniform recall strategy to all lost accounts, combined with some operational measures. However, such schemes lack personalization: the accounts that can actually be recalled are not accurately located, recall operations are performed on accounts that cannot be recalled, the recall rate of lost accounts is low, and operating costs are wasted.
No effective solution has yet been proposed for the problem of the low recall rate of lost accounts in the related art.
Disclosure of Invention
The embodiment of the invention provides a recall method and device for a lost account, a storage medium and electronic equipment, which are used for at least solving the technical problem of low recall rate of the lost account.
According to one aspect of the embodiment of the invention, there is provided a recall method for a loss account, including: acquiring a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set; according to the lost account feature set, clustering operation is carried out on the lost account set in a plurality of preset clustering clusters to obtain a first lost account subset in the lost account set, wherein the clustering cluster to which the first lost account subset belongs is a target clustering cluster, the recall probability of which meets a first preset condition, in the plurality of clustering clusters, and the plurality of features of the lost accounts in the first lost account subset meet the clustering condition corresponding to the target clustering cluster; inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are the loss accounts which belong to the recall account category and are predicted by the target neural network model; and determining a third current loss account subset to be recalled according to the first current loss account subset and the second current loss account subset.
According to another aspect of the embodiments of the present invention, there is also provided a recall device for a lost account, including: an acquisition module, configured to acquire a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set; an execution module, configured to perform a clustering operation on the loss account set over a plurality of predetermined clusters according to the loss account feature set to obtain a first loss account subset of the loss account set, wherein the cluster to which the first loss account subset belongs is a target cluster whose recall probability among the plurality of clusters meets a first preset condition, and the plurality of features of the loss accounts in the first loss account subset meet the clustering condition corresponding to the target cluster; an input module, configured to input the loss account feature set into a target neural network model to obtain a second loss account subset of the loss account set, wherein the loss accounts in the second loss account subset are loss accounts predicted by the target neural network model to belong to the recall account category; and a determining module, configured to determine a third loss account subset to be recalled according to the first loss account subset and the second loss account subset.
According to yet another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the recall method of the above-described attrition account when running.
According to yet another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, and a processor, where the memory stores a computer program, and the processor is configured to execute the recall method of the attrition account described above by using the computer program.
In the embodiments of the invention, clustering and a neural network model are used together: a clustering operation is performed on the lost account set, the lost account feature set is processed by the target neural network model, and the third lost account subset to be recalled is determined from the first lost account subset obtained by the clustering operation and the second lost account subset obtained by the target neural network model. This allows lost accounts to be recalled accurately, which improves the recall rate of lost accounts and thereby solves the technical problem of a low recall rate.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a schematic illustration of an application environment of an alternative recall method for a attrition account in accordance with an embodiment of the present invention;
FIG. 2 is a flow chart of an alternative recall method for a attrition account in accordance with an embodiment of the present invention;
FIG. 3 is a training schematic of an alternative LightGBM model according to an embodiment of the invention;
FIG. 4 is a schematic diagram of an alternative DeepFM model structure according to an embodiment of the present invention;
FIG. 5 is an alternative Stacking fusion schematic according to an embodiment of the invention;
FIG. 6 is a schematic diagram of an alternative plurality of clusters in accordance with an embodiment of the present invention;
FIG. 7 is an alternative portrait radar chart according to an embodiment of the present invention;
FIG. 8 is a schematic illustration of an alternative recall flow according to an embodiment of the invention;
FIG. 9 is a schematic diagram of an alternative recall device for a attrition account in accordance with an embodiment of the present invention;
fig. 10 is a schematic structural view of an alternative electronic device according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiment of the present invention, a method for recalling a loss account is provided, optionally, as an optional implementation manner, the above-mentioned method for recalling a loss account may be applied, but is not limited to, in a system environment as shown in fig. 1. The system environment includes: terminal equipment 102, network 110, server 112.
Alternatively, in this embodiment, the terminal device may be configured with a target application, and the terminal device may include, but is not limited to, at least one of the following: a mobile phone (e.g., an Android mobile phone, iOS mobile phone, etc.), a notebook computer, a tablet computer, a palm computer, a MID (Mobile Internet Devices, mobile internet device), a PAD, a desktop computer, a smart television, etc. The target application may be a game application client, a video application client, an instant messaging application client, a browser application client, an educational application client, and the like. The terminal device includes, but is not limited to, a memory 104 for storing a collection of attrition accounts and a corresponding collection of attrition account features, a processor 106, and a display 108. The processor is used for processing the loss account number set and the corresponding loss account number feature set. The display may be configured to display the attrition accounts in the collection of attrition accounts.
The network 110 may include, but is not limited to: a wired network, a wireless network, wherein the wired network comprises: local area networks, metropolitan area networks, and wide area networks, the wireless network comprising: bluetooth, WIFI, and other networks that enable wireless communications.
The server 112 may be a single server, a server cluster including a plurality of servers, or a cloud server. The server includes a database 114 for storing data, including but not limited to, a collection of attrition accounts and a corresponding collection of attrition account features, and a processing engine 116. The processing engine is configured to process data, including but not limited to performing a clustering operation on the loss account set in a plurality of predetermined clusters according to the feature set of the loss account, so as to obtain a first subset of the loss accounts in the loss account set; inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set; and determining a third current loss account subset to be recalled according to the first current loss account subset and the second current loss account subset.
The above is merely an example, and is not limited in any way in the present embodiment.
Optionally, as an optional implementation manner, as shown in fig. 2, the recall method of the loss account includes:
step S202, acquiring a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set;
the above-mentioned loss account set includes a plurality of loss accounts, for example, may include loss accounts of 100, 1000, 5000, etc. The attrition account is an account currently in an attrition state in the target application, where the attrition state includes, but is not limited to, an account that does not log in the target for more than a predetermined period of time, and the preset period of time may be determined according to practical situations, for example, a week, a month, 45 days, 5 hours, and the like. Taking the above target application as a game application client as an example, the attrition account may be a player that has not logged into the game client for more than one week. The loss account feature set may include a plurality of feature groups, each of which may include a plurality of features. The specific features included in the loss account feature set may be determined according to practical situations, for example, the features of accounts in different application clients may be different, and the loss account feature set may be set according to the features of accounts in the clients. Taking the example that the target application is a game client, the loss account feature set may include: basic attribute feature sets, active feature sets, payment feature sets, play feature sets, intra-game social feature sets, and the like. The basic attribute feature set may include: combat power, account number level, VIP level, etc. The active feature set may comprise: the last active login days of a month, the online time, the average online time, the day/night time duty ratio and the like of the player; the payment feature set may comprise: historical payment amount, last active payment amount for one month, etc.; the set of play features may include: the ratio of various playing methods is participated in historically; in-game social feature group: chat times with others, application friend times, participation in team war times, and the like.
Step S204, according to the lost account feature set, clustering operation is carried out on the lost account set in a plurality of predetermined clusters to obtain a first lost account subset in the lost account set, wherein the cluster to which the first lost account subset belongs is a target cluster of which recall probability in the plurality of clusters meets a first preset condition, and a plurality of features of the lost account in the first lost account subset meet a clustering condition corresponding to the target cluster;
The predetermined plurality of clusters may be obtained by a clustering algorithm, including but not limited to the k-means clustering algorithm, the k-means++ clustering algorithm, and the like. Each cluster satisfies a corresponding clustering condition, which may involve one or several features. For example, the loss accounts in one of the clusters may satisfy the clustering condition that the online duration exceeds a preset duration and the historical payment amount exceeds a preset amount. The target cluster may be a cluster with a higher recall probability; for example, an account with a shorter loss duration (less than a preset duration) and a larger historical payment (greater than a preset amount) is easier to recall, and the loss accounts in the first loss account subset then satisfy the clustering condition of a shorter loss duration and a larger historical payment.
Step S206, inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are the loss accounts which belong to the recall account category and are predicted by the target neural network model;
the target neural network model may be a Stacking fusion neural network model, may include a LightGBM, deepFM model, may use a training sample to train the initial model, may use a LightGBM, deepFM model after training to obtain a attrition account belonging to the recall account category, and the second attrition account subset includes a plurality of attrition accounts belonging to the recall account category in the attrition account set.
Step S208, determining a third subset of the lost accounts to be recalled according to the first subset of the lost accounts and the second subset of the lost accounts.
Wherein the attrition accounts in the first subset of attrition accounts and the second subset of attrition accounts are attrition accounts with greater recall probability. Duplicate accounts may be present in the first subset of the loss accounts and the second subset of the loss accounts, and the third subset of the loss accounts may be obtained by deduplicating the first subset of the loss accounts and the second subset of the loss accounts.
Through the above steps, clustering and neural network processing are combined: a clustering operation is performed on the lost account set, the target neural network model processes the lost account feature set, and the third lost account subset to be recalled is determined from the first lost account subset obtained by the clustering operation and the second lost account subset obtained by the target neural network model. Lost accounts can thus be recalled accurately, which improves the recall rate of lost accounts and solves the technical problem of a low recall rate.
Optionally, the inputting the feature set of the attrition account into a target neural network model to obtain a second subset of the attrition accounts in the attrition account set includes: inputting the loss account feature set into a first prediction neural network model in the target neural network model to obtain a first probability set predicted by the first prediction neural network model, wherein the first probability set comprises the probability that each loss account in the loss account set belongs to the recall account class, and the first prediction neural network model is used for determining the probability that each loss account belongs to the recall account class according to the loss account feature set; inputting the loss account feature set and the first probability set into a second prediction neural network model in the target neural network model to obtain a second probability set predicted by the second prediction neural network model, wherein the second probability set comprises the probability that each loss account in the loss account set belongs to the recall account category, and the second prediction neural network model is used for determining 2-order cross features and higher than 2-order cross features according to the loss account feature set and the first probability set, and determining the probability that each loss account belongs to the recall account category according to the 2-order cross features and the higher than 2-order cross features; and determining the second loss account subset in the loss account set according to the first probability set and the second probability set.
As an alternative embodiment, the first prediction neural network model may be a LightGBM classification model. LightGBM (Light Gradient Boosting Machine) is a framework implementing the GBDT (Gradient Boosting Decision Tree) algorithm. The main idea of GBDT is to iteratively train weak classifiers (decision trees) to obtain an optimal model; such models train well and are not prone to overfitting. LightGBM is a distributed gradient boosting framework based on decision tree algorithms. Its design has two main goals: reducing memory usage so that a single machine can use as much data as possible without sacrificing speed, and reducing communication cost so that multi-machine parallelism is efficient and near-linear speedup is achieved. LightGBM was thus designed to be a fast and efficient data science tool with a low memory footprint and high accuracy that supports parallel and large-scale data processing.
As an optional implementation, training sample data may be used to train the initial LightGBM model, and the trained LightGBM model may then identify the loss accounts belonging to the recall account category. The to-be-identified loss account set is input into the trained LightGBM model to obtain an output result, which includes the probability that each loss account in the set belongs to the recall account category; a larger probability indicates that the corresponding loss account is more likely to be recalled. Fig. 3 is a training schematic diagram of the LightGBM model according to an alternative embodiment of the present invention. Taking a loss sample account set as the training sample data, the account features of the loss sample accounts are input, including but not limited to the basic attribute features, active features, payment features, play features and social features shown in the figure. The LightGBM model outputs the probability that each loss sample account is a positive sample, where a positive sample is a recalled sample. During training, when the difference between the positive-sample probability output by the LightGBM model and the known label of the loss sample account meets a preset convergence condition, training is stopped to obtain the trained LightGBM model.
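A minimal sketch of this training step using the open-source lightgbm package's scikit-learn interface is shown below; the variable names, hyperparameters and data split are assumptions for illustration and are not prescribed by the embodiment.

```python
import lightgbm as lgb
from sklearn.model_selection import train_test_split

# X: loss-sample account feature matrix, y: actual labels (1 = recalled in the
# second time period, 0 = not recalled). Both are assumed to be prepared already,
# as is X_to_identify, the feature matrix of the to-be-identified loss accounts.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

lgbm = lgb.LGBMClassifier(
    n_estimators=500,       # illustrative hyperparameters, not from the patent
    learning_rate=0.05,
    num_leaves=63,
)
lgbm.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],   # validation set used to monitor convergence
)

# First probability set: probability that each to-be-identified loss account
# belongs to the recall account category.
first_probability_set = lgbm.predict_proba(X_to_identify)[:, 1]
```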
As an alternative embodiment, the second prediction neural network model may be a DeepFM model, which learns cross features using dot products of latent (hidden) vectors. Because of complexity constraints, FM typically only models order-2 (pairwise) cross features, while deep models are good at capturing high-order, complex features. DeepFM is derived from FM by combining a deep network with FM: the FM part models low-order feature combinations, the deep part models high-order feature combinations, and the two run in parallel within DeepFM. Fig. 4 is a schematic diagram of a DeepFM model according to an alternative embodiment of the invention, in which the Fields represent the feature groups together with the probability, output by the first prediction neural network model in the first probability set, that each loss account belongs to the recall account category. For example, Field i may represent the basic attribute feature group, with each node in Field i representing a feature in that group: combat power, account level, VIP level, etc. Field j may represent the active feature group, with each node in Field j representing a feature in that group: the player's active login days in the last month, online duration, average online duration, day/night time ratio, and the like. Field m may represent the probability that the loss account belongs to the recall account category. Each feature is mapped to its corresponding dense embedding through a lookup operation.
y_{FM} = \langle w, x \rangle + \sum_{j_1=1}^{d} \sum_{j_2=j_1+1}^{d} \langle V_{j_1}, V_{j_2} \rangle x_{j_1} x_{j_2}

In the above formula, w and the latent vectors V_{j_1}, V_{j_2} are the parameters the model is required to learn, and x denotes the model input feature vector, which comprises the loss account feature set and the first probability set. x_{j_1} and x_{j_2} denote the j_1-th and j_2-th feature values respectively; \langle w, x \rangle extracts the first-order features, and the double summation extracts the second-order cross features.

a^{(0)} = \{e_1, e_2, e_3, \ldots, e_m\}

a^{(l+1)} = \sigma(W^{(l)} a^{(l)} + b^{(l)})

y_{DNN} = \sigma(W^{(|H|+1)} a^{(|H|)} + b^{(|H|+1)})

In the above formulas, e_i denotes the embedding vector of the i-th Field, a^{(0)} is the input of the deep neural network, W^{(l)}, a^{(l)}, b^{(l)} denote the weight, output and bias parameters of the l-th layer respectively, |H| denotes the number of hidden layers of the network, and \sigma denotes the activation function. The final output of the DeepFM model is

y = \sigma(y_{FM} + y_{DNN})
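For illustration only, a minimal NumPy sketch of the FM score defined above is given here; the embodiment does not prescribe an implementation, and the function name and shapes are assumptions.

```python
import numpy as np

def fm_score(x, w, V, b=0.0):
    """Compute y_FM = b + <w, x> + sum_{j1<j2} <V_j1, V_j2> x_j1 x_j2.

    x: (d,) input feature vector, w: (d,) first-order weights,
    V: (d, k) latent vectors, one k-dimensional row per feature.
    Uses the standard O(d*k) reformulation of the pairwise term.
    """
    linear = np.dot(w, x)
    xv = V * x[:, None]  # each latent row scaled by its feature value
    pairwise = 0.5 * np.sum(np.sum(xv, axis=0) ** 2 - np.sum(xv ** 2, axis=0))
    return b + linear + pairwise
```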
The DeepFM model is an end-to-end model that can extract features of varying complexity from the original features, and it has two main advantages: the DeepFM model comprises an FM part and a DNN part, where the FM part extracts low-order features and realizes feature crossing automatically, while the DNN part extracts high-order features; and the DeepFM model trains very quickly, since the input consists only of the original features and the FM and DNN parts share the input feature vector.
As an alternative implementation, the FM layer models pairwise interactions to determine the order-2 cross features, the hidden layers realize high-order combinations among features to determine cross features of order higher than 2, and the output units obtain the probability that each loss account in the loss account set belongs to the recall account category from the order-2 cross features determined by the FM layer and the higher-order cross features determined by the hidden layers; the larger this probability, the more likely the corresponding loss account can be recalled.
As an optional implementation, Stacking fusion is performed on the LightGBM and DeepFM models, as shown in fig. 5, which is a Stacking fusion schematic diagram according to an optional embodiment of the present invention. The loss account feature set is input into the LightGBM model to obtain the first probability set output by the LightGBM model, and the loss account feature set together with the first probability set is then input into the DeepFM model to obtain the second probability set output by the DeepFM model, where the second probability set includes a plurality of probability values, each representing the probability that an account belongs to the recall account category.
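A minimal sketch of this Stacking step is shown below, assuming a deepfm object whose predict method returns recall probabilities; that interface and the variable names are assumptions carried over from the earlier sketch.

```python
import numpy as np

# Stacking: append the LightGBM probability to the original feature matrix and use
# the combined matrix as the DeepFM input. `deepfm` stands in for whatever DeepFM
# implementation is used; its predict() interface here is an assumption.
X_stacked = np.column_stack([X_to_identify, first_probability_set])
second_probability_set = deepfm.predict(X_stacked)   # probability of recall per account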
As an optional implementation, the average of the LightGBM output probability and the DeepFM output probability is taken as the probability that each loss account belongs to the recall account category, and the final second loss account subset is obtained by setting a suitable threshold; the prediction result in the figure comprises the second loss account subset. The probability average of each loss account in the second loss account subset is greater than the threshold. The threshold may be chosen according to practical needs; assuming the threshold is 0.5, a loss account whose average of the LightGBM and DeepFM output probabilities exceeds 0.5 belongs to the recall account category and is included in the second loss account subset.
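Continuing the sketch, the averaging-and-threshold step might look as follows; account_ids and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np

# first_probability_set: LightGBM outputs; second_probability_set: DeepFM outputs.
# Both are aligned with account_ids, the list of to-be-identified loss accounts
# (all three assumed to be prepared in the earlier sketches).
final_probability = (np.asarray(first_probability_set) +
                     np.asarray(second_probability_set)) / 2.0

THRESHOLD = 0.5  # illustrative value from the example above
second_loss_account_subset = [acc for acc, p in zip(account_ids, final_probability)
                              if p > THRESHOLD]
```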
In this embodiment, through the Stacking fusion of the LightGBM model and the DeepFM model, the probability output by the LightGBM model that each loss account belongs to the recall account category is used as an input of the DeepFM model, which enhances the fitting capacity of the model and improves the recognition accuracy.
Optionally, the determining the second subset of the loss accounts in the loss account set according to the first probability set and the second probability set includes: determining the final prediction probability of each loss account in the loss account set belonging to the recall account category according to the first probability set and the second probability set, wherein the final prediction probability of each loss account is the average value of probabilities corresponding to each loss account in the first probability set and the second probability set; and searching the loss account number with the final prediction probability larger than a preset threshold value in the loss account number set to obtain the second loss account number subset.
As an optional implementation, the final prediction probability of each loss account is the average of its probabilities in the first probability set and the second probability set. Assume that the probability of loss account A belonging to the recall account category is 0.6 in the first probability set and 0.5 in the second probability set; the average of 0.6 and 0.5, namely 0.55, is the final prediction probability that loss account A belongs to the recall account category. Assume that the probability of loss account B belonging to the recall account category is 0.3 in the first probability set and 0.4 in the second probability set; the average of 0.3 and 0.4, namely 0.35, is the final prediction probability that loss account B belongs to the recall account category. The preset threshold may be chosen according to practical needs. If the preset threshold is 0.5, then account A, with a final prediction probability of 0.55, belongs to the second loss account subset, while account B, with a final prediction probability of 0.35, does not.
Optionally, the method further comprises: obtaining a corresponding lost sample account feature set and a corresponding actual labeling result set of a lost sample account set, wherein the lost sample account set comprises accounts which are in a lost state in a preset first time period in the target application, the lost sample account feature set comprises a plurality of features of each lost account in the lost sample account set, each labeling result in the actual labeling result set represents whether the corresponding lost account in the lost sample account set actually becomes a recall account in a preset second time period, and the second time period is later than the first time period; training a first neural network model by using the loss sample account feature set and the actual labeling result set until a loss function between a first prediction labeling result set output by the first neural network model to be trained and the actual labeling result set meets a second preset condition, so as to obtain the first prediction neural network model, wherein each labeling result in the first prediction labeling result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
As an optional implementation manner, taking the first prediction neural network model as the LightGBM model as an example, the LightGBM model is obtained by training a lost sample account feature set and an actual labeling result set. The loss sample account feature set includes a plurality of features of each loss sample account in the loss sample account set. The loss sample account set is used as a training sample, and is a loss sample account in a historical time, for example, an account which is not logged in a game client for one week is used as a loss sample account, and an actual labeling result is obtained according to whether the loss sample account is recalled or not and the loss sample account is labeled.
As an alternative embodiment, assume that 100 accounts that did not log in to the target application during 20201101-20201107 (the first time period) are the loss sample accounts, and count which of these loss sample accounts logged in to the target application again during 20201108-20201208 (the second time period). A loss sample account that logged in again becomes a recall account. Recalled and non-recalled accounts may be labeled with 1 and 0 respectively: a recall account that re-logged in to the target application during 20201108-20201208 (the second time period) is labeled 1, and a loss sample account that did not is labeled 0. The LightGBM model is trained with the loss sample accounts in the loss sample account set and the actual labeling result set; during training, if the loss function between the predicted labeling results output by the LightGBM model and the actual labeling results of the loss sample accounts meets the convergence condition, training stops and the trained LightGBM model is obtained. The predicted labeling result represents the probability that a loss sample account becomes a recall account during 20201108-20201208 (the second time period).
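A small pandas sketch of how such labels could be constructed from a login log is given below; the table name, column names and date ranges are assumptions that mirror the example above.

```python
import pandas as pd

# login_log: one row per login event, with hypothetical columns
# ["account_id", "login_date"]; all_account_ids is the full account list (assumed).
first_period  = ("2020-11-01", "2020-11-07")   # no login here -> loss sample account
second_period = ("2020-11-08", "2020-12-08")   # re-login here -> recalled (label 1)

dates = pd.to_datetime(login_log["login_date"])
active_in_first  = set(login_log.loc[dates.between(*map(pd.Timestamp, first_period)),  "account_id"])
active_in_second = set(login_log.loc[dates.between(*map(pd.Timestamp, second_period)), "account_id"])

loss_sample_accounts = [a for a in all_account_ids if a not in active_in_first]
labels = {a: int(a in active_in_second) for a in loss_sample_accounts}  # 1 = recalled
```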
Optionally, the method further comprises: training a second neural network model to be trained by using the lost sample account feature set, the actual labeling result set and the first prediction labeling result set until a loss function between the second prediction labeling result set output by the second neural network model to be trained and the actual labeling result set meets a third preset condition, so as to obtain the second prediction neural network model, wherein each labeling result in the second prediction labeling result set represents the probability that a corresponding lost account in the predicted lost sample account set becomes a recall account in the second time period.
As an optional implementation, take the DeepFM model as the second prediction neural network model; the DeepFM model is trained using the loss sample account feature set, the actual labeling result set, and the predicted labeling results output by the LightGBM model. During training, if the loss function between the predicted labeling results of the loss sample accounts output by the DeepFM model and their actual labeling results meets the convergence condition, training stops and the trained DeepFM model is obtained. The predicted labeling result represents the probability that a loss sample account becomes a recall account during 20201108-20201208 (the second time period).
Optionally, the method further comprises: obtaining a recall sample account feature set corresponding to the recall sample account set, wherein each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, the recall sample account feature set comprises a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period; determining a group of key features and feature values of the group of key features according to the recall sample account feature set; and clustering lost accounts in a lost sample account set by using the set of key features and the feature values of the set of key features to obtain the plurality of clustering clusters and clustering conditions corresponding to each clustering cluster, wherein the clustering conditions corresponding to each clustering cluster comprise one or more key features and the corresponding feature values in the set of key features, the lost sample account set comprises the recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
As an alternative embodiment, the recall sample accounts included in the recall sample account set are in a state of loss for a first period of time and become recall accounts for a second period of time. Assume that 20201101-20201107 (first period) 100 accounts that are not logged into the target application are taken as attrition sample accounts, and are in an attrition state in the above first period (20201101-20201107). If 30 accounts re-log into the target application at 20201108-20201208 (second time period), it is determined that the 30 accounts become recall accounts. The recall sample account feature set includes a plurality of features of the recall account, such as the basic attribute features described above, active features, payment features, play features, in-game social features, and the like. The key features may be features commonly possessed by recall accounts, that is, features possessed by most recall accounts. Assuming that statistics find that most recall accounts have characteristics that the loss time is less than a preset time length and the historical payment is greater than a preset threshold, the preset time length and the preset threshold can be determined according to practical situations, for example, the preset time length can be 10 hours, 20 hours, 30 hours and the like, and the preset threshold can be 500, 600, 1000 and the like. The duration of the loss and the historical payment are determined to be a set of key features and feature values of the duration of the loss and the historical payment of the recall account are obtained, for example, the duration of the loss of a certain recall account is 5 hours and the historical payment is 3000 yuan.
As an alternative embodiment, a loss account may have a large number of features. To make the clustering effect more obvious, the loss sample accounts are clustered using the set of key features determined in this embodiment. Assume the determined set of key features includes: online duration, historical payment amount, loss duration, and level. The loss accounts in the loss sample account set are clustered according to this set of key features. Fig. 6 is a schematic diagram of a plurality of clusters according to an alternative embodiment of the present invention, where each cluster may include one or more loss accounts in the loss sample account set; the loss accounts include recalled accounts and non-recalled accounts, with white marks denoting recalled accounts and gray marks denoting non-recalled accounts. Assume that 100 accounts that did not log in to the target application during 20201101-20201107 are the loss sample accounts, being in a loss state during that period. If 30 of these accounts logged in to the target application again during 20201108-20201208, those 30 accounts are determined to have become recall accounts, and the other 70 accounts are non-recalled sample accounts. The clustering condition corresponding to each cluster may be one or several key features and their feature values; assume that, in fig. 6, the historical payment amount and the loss duration form the clustering condition of the first cluster, the online duration and the level form the clustering condition of the second cluster, the historical payment amount and the online duration form the clustering condition of the third cluster, and the loss duration and the level form the clustering condition of the fourth cluster. By clustering the loss sample accounts according to the set of key features, since each cluster has a corresponding clustering condition, the clusters to which the recalled and non-recalled accounts respectively belong, and the features of the accounts in those clusters, can be determined intuitively. A portrait of the loss sample accounts can thus be obtained; fig. 7 is a portrait radar chart according to an alternative embodiment of the present invention, from which the features of different types of accounts can be clearly distinguished. In this embodiment, assuming that the first cluster contains many recall accounts, the clustering condition corresponding to the first cluster (historical payment amount and loss duration) is determined to characterize recall accounts, and the preset duration and preset threshold are determined by counting the historical payments and loss durations of the recall accounts. Assuming the historical payment amount of the recall accounts is greater than 1000 yuan and their loss duration is less than 5 hours, the preset duration is set to 5 hours and the preset threshold to 1000 yuan. Therefore, a loss account whose historical payment amount exceeds 1000 yuan and whose loss duration is less than 5 hours can be recalled easily. In this embodiment, by clustering the loss accounts in the loss sample account set with the determined set of key features, the recalled accounts and the non-recalled accounts in the loss sample account set can be distinguished more clearly.
As an alternative embodiment, the clustering operation may be performed using the Kmeans or Kmeans++ algorithm. The Kmeans clustering algorithm is an iterative cluster analysis algorithm that divides the loss accounts in the to-be-identified loss account set into K groups: K loss accounts are randomly selected from the to-be-identified loss account set as initial cluster centers, the distance between each loss account in the set and each cluster center is calculated, and each loss account is assigned to its nearest cluster center, with a cluster center and the loss accounts assigned to it representing one cluster. After every loss account has been assigned, each cluster center is recalculated from the loss accounts currently in the cluster. This process is repeated until a preset termination condition is met, for example that the cluster centers no longer change, that the sum of squared errors is locally minimal, or that no loss account is reassigned to a different cluster.
Wherein, the Kmeans clustering algorithm may comprise the following steps:
step S11, the loss sample account set comprises M loss sample accounts, and K points can be randomly selected from the loss sample account set to serve as initial clustering centers.
Step S12, calculating the distance between each lost sample account and each clustering center, such as Euclidean distance or cosine distance, of the lost sample account set, and dividing the lost sample accounts into clusters corresponding to the closest clustering centers.
Step S13, after all the loss sample accounts in the loss sample account set have been assigned to their corresponding clusters, the M loss sample accounts are divided into K clusters.
Step S14, recalculate the cluster center of each cluster as the mean of the accounts in that cluster, and repeat steps S12 and S13 until the termination condition is met, so as to obtain the final clustering result; the termination condition may be that the cluster centers no longer change, that the sum of squared errors is locally minimal, or that no loss account is reassigned to a different cluster.
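A minimal NumPy sketch of steps S11-S14 follows, assuming the loss sample accounts have already been turned into a numeric feature matrix X; the function name and parameters are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal K-means following steps S11-S14: random initial centers,
    assign each sample to the nearest center, recompute centers, repeat."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]                    # step S11
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)   # step S12
        labels = dists.argmin(axis=1)                                         # step S13
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])           # step S14
        if np.allclose(new_centers, centers):   # termination: centers no longer change
            break
        centers = new_centers
    return labels, centers
```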
As an alternative embodiment, taking the Kmeans++ clustering algorithm as the target clustering model as an example, the Kmeans++ algorithm is an improvement of the Kmeans algorithm and may include the following steps:
step S21, randomly selecting a loss sample account from the loss sample account set as a first cluster center mu 1
Step S22, calculating other lost sample accounts to the first clustering center mu 1 Is respectively written as: [ d1, d2, d3, ].]Note d=d1+d2+d3+ …;
step S23, each loss sample account is selected as the second cluster center μ 2 The probabilities of (a) are respectively
Figure BDA0003127211670000161
Step S24, randomly selecting one of the loss sample accounts as a second clustering center mu according to the probability 2 . Specifically, the attrition sample account with the highest probability can be selected as the second clustering center;
in step S25, the distances between the other samples and the first and second polymer centers are calculated, based on the shortest distances. The above step 23 is repeated, step 24, to calculate the remaining cluster centers.
Optionally, the determining a set of key features and feature values of the set of key features according to the recall sample account feature set includes: when the number of the features meeting the current feature condition in the recall sample account feature set is larger than a preset threshold, determining the current feature corresponding to the current feature condition as a key feature, and determining the feature value of the current feature corresponding to the current feature condition as the feature value of the key feature, wherein the current feature condition is a logic judgment condition executed on the current feature and the feature value of the current feature.
As an optional implementation, the recall sample account feature set includes a plurality of features of each recall sample account, and the set of key features is determined from these features. The current feature is a feature of the recall samples, for example the historical payment amount or the loss duration, and the feature value is the value a recall sample takes on the current feature, for example a historical payment amount of 100 yuan or a loss duration of 5 hours. The logical determination condition may be a comparison against a preset value: for example, the current feature condition may be that the historical payment amount is greater than 100 yuan and the loss duration is less than 5 hours. If the number of features in the recall sample account feature set that meet the current feature condition is greater than a preset threshold, the corresponding feature is determined to be a key feature. For example, if the number of features in the recall sample feature set for which the historical payment amount is greater than 100 yuan and the loss duration is less than 5 hours exceeds the preset threshold, the historical payment amount and the loss duration are determined to be key features; the preset threshold may be, for example, 100, 200 or 1000, depending on the practical situation. In this embodiment, feature importance may also be ranked by the LightGBM model: as in fig. 3, during training the LightGBM model can output a feature importance ranking, and for example the top 5 or 10 features may be selected as key features. Determining the key features from the number of features satisfying the current feature condition in the recall sample account feature set allows key features to be identified among the many features of the recall accounts; clustering the loss sample accounts with these key features then distinguishes the recalled and non-recalled accounts clearly.
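As a sketch of the feature-importance route mentioned above, assuming lgbm is the trained LightGBM model and feature_names lists its input columns (both carried over from the earlier sketch and therefore assumptions):

```python
import pandas as pd

# Rank the trained model's feature importances and take the top-ranked features
# as "key features"; the cutoff of 5 is one of the illustrative values above.
importance = pd.Series(lgbm.feature_importances_, index=feature_names)
key_features = importance.sort_values(ascending=False).head(5).index.tolist()
print(key_features)  # e.g. ["historical_payment", "loss_duration", ...] (hypothetical)
```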
Optionally, the clustering operation is performed on the loss account set in a plurality of predetermined clusters according to the loss account feature set, so as to obtain a first loss account subset in the loss account set, including: acquiring the key features and the corresponding feature values of each loss account from the loss account feature set; and clustering each loss account in the loss account set into the clustering clusters according to the acquired key features and corresponding feature values of each loss account.
As an alternative embodiment, the set of key features is obtained from the loss sample accounts, and the plurality of clusters is obtained by clustering the loss sample accounts on that set of key features. For the to-be-identified loss account set, the feature value of each key feature of each loss account is acquired from the corresponding loss account feature set, and each loss account is then assigned, according to these key-feature values, to one of the clusters produced by clustering the loss sample accounts. For example, the clustering condition of the first cluster in fig. 6 is the historical payment amount and the loss duration; if a to-be-identified loss account meets the clustering condition of the first cluster, it is assigned to the first cluster. In this embodiment, the to-be-identified loss accounts can thus be clustered into the plurality of predetermined clusters according to the feature values of their key features, and from the clustering result it can be determined whether a to-be-identified loss account is easy to recall. If the first cluster contains many recalled accounts and a to-be-identified loss account is clustered into the first cluster, that account is determined to be easy to recall.
Optionally, performing clustering operation on the loss account set in a plurality of predetermined clusters according to the loss account feature set to obtain a first loss account subset in the loss account set, including: determining the distance between each loss account and the cluster core of the target cluster according to the characteristics of each loss account, wherein the target cluster is the cluster with the highest recall probability in the clusters; and determining the first loss account subset from the loss account set according to the distance between each loss account and the cluster core of the target cluster, wherein the distance corresponding to the loss account in the first loss account subset meets a preset distance condition.
As an optional implementation, assuming the first cluster contains the largest number of recall accounts, or the largest ratio of recall accounts to loss accounts, the first cluster is determined to be the target cluster with the highest recall probability. The cluster core is the cluster center of the target cluster; for each to-be-identified loss account, the distance to the cluster center of the target cluster can be determined, and the loss accounts that are easy to recall can be selected from the to-be-identified loss account set according to this distance, the accounts in the first loss account subset being the loss accounts that are easy to recall. The preset distance condition may be that the distance is smaller than a preset distance, which may be set according to practical needs, for example 5, 10 or 15. In this embodiment, the distance between a to-be-identified loss account and the cluster center of the target cluster thus determines whether that account is easy to recall, and the set of accounts that are easy to recall forms the first loss account subset.
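A minimal sketch of this distance-based selection, assuming centers, target_idx, X_to_identify and account_ids have been prepared as in the earlier sketches (all names and the preset distance are illustrative):

```python
import numpy as np

# centers: cluster centers from the clustering step; target_idx: index of the
# target cluster (the one with the most recalled accounts); X_to_identify: the
# key-feature values of the to-be-identified loss accounts.
target_center = centers[target_idx]
dist_to_core = np.linalg.norm(X_to_identify - target_center, axis=1)

PRESET_DISTANCE = 5.0  # illustrative preset distance condition
first_loss_account_subset = [acc for acc, d in zip(account_ids, dist_to_core)
                             if d < PRESET_DISTANCE]
```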
Optionally, the determining a third loss account subset to be recalled according to the first loss account subset and the second loss account subset includes: determining the union of the first loss account subset and the second loss account subset as the third loss account subset.
As an optional implementation, the first loss account subset contains the accounts determined by the clustering algorithm to be easy to recall, and the second loss account subset contains the accounts determined by the neural network algorithm to be easy to recall; taking the union of the first loss account subset and the second loss account subset yields the loss accounts in the to-be-identified loss account set that are easy to recall, namely the third loss account subset.
Optionally, after determining the third loss account subset to be recalled, the method further comprises: searching an active account related to the third loss account subset in the account set of the target application to obtain an active account set, wherein the active account set comprises accounts currently in an active state in the target application, and the correlation degree between the active accounts in the active account set and the corresponding loss accounts in the third loss account subset meets a fourth preset condition; and sending preset recall information to the third loss account subset through the active account set.
As an optional implementation manner, the active account may be an account currently logged in to the target application, an account that has recently logged in to the target application, or an account that has recently logged in to the target application frequently. The recent period may be, for example, about 1 day or about 1 hour, depending on the actual situation. The login frequency may also be determined according to the actual situation, for example, 10 logins in 1 day. The correlation degree may be the degree of closeness of a friend relationship, which may be based on the number of communications or the number of games played together. The fourth preset condition may be determined according to practical situations, for example, the number of communications is greater than 100 and the number of games played together exceeds 50. The active account meeting the fourth preset condition is taken as a close friend of the loss account to be identified, and when the loss account is determined to be an account that is easy to recall, recall information is sent to the loss account through the close friend so as to recall the loss account.
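For illustration only, a sketch of how active accounts meeting the fourth preset condition might be selected as close friends, assuming illustrative thresholds (100 communications, 50 games played together) and a hypothetical friend-record structure.

```python
# For illustration only: select the active accounts whose relatedness to a loss
# account meets the fourth preset condition. The thresholds and record fields
# are hypothetical examples.
def close_active_friends(friend_records, min_messages=100, min_co_games=50):
    """friend_records: dicts such as
    {"account_id": ..., "is_active": bool, "messages": int, "co_games": int}
    describing accounts related to one loss account."""
    return [
        r["account_id"]
        for r in friend_records
        if r["is_active"] and r["messages"] > min_messages and r["co_games"] > min_co_games
    ]
```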
Optionally, the sending, by the active account set, preset recall information to the third loss account subset includes: according to the features of each active account in the active account set, performing a clustering operation on the active account set to obtain a target clustering result, wherein the target clustering result comprises the account category to which each active account in the active account set belongs; setting recall information corresponding to the account category to which each active account belongs for each active account in the active account set; and sending the preset recall information corresponding to the account category to which each active account belongs to the corresponding loss account in the third loss account subset through each active account in the active account set.
As an optional implementation manner, the target clustering model may be a KMeans++ clustering model, and the clustering operation is performed on each active account in the active account set through the KMeans++ clustering model. The clustering result includes the account category of each active account, including but not limited to core players and struggling players. Personalized recall information is set according to the account category of the active account. As shown in fig. 8, which is a schematic view of a recall flow according to an alternative embodiment of the present invention, the first loss account subset is obtained through the clustering operation, the second loss account subset is obtained by using a stacking model (a fusion model of LightGBM and DeepFM), the third loss account subset that may be recalled is obtained through fusion and deduplication, the most closely related active account is searched for each loss account in the third loss account subset, the category of that active account is determined, recall information is set according to the category of the active account, and the recall information is sent to the accounts in the third loss account subset that can be recalled. For example, the recall information set for a core player may be "come team up with XX and push the map together", and the recall information set for a struggling player may be "XX has had a rough time recently and urgently needs your help". By setting personalized recall information and having close friends send it to the loss accounts, the recall rate can be improved.
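For illustration only, the following sketch outlines the recall flow described above, assuming scikit-learn's KMeans with k-means++ initialisation for the active-account clustering, two illustrative account categories and message templates, and a hypothetical send_message delivery hook; it is a simplified stand-in, not the patented implementation.

```python
# For illustration only: cluster active accounts, pick a recall message by the
# active friend's category, and send it to the corresponding loss account.
# Categories, message templates and send_message are hypothetical placeholders.
import numpy as np
from sklearn.cluster import KMeans

def send_message(from_account, to_account, text):
    # Hypothetical delivery hook; a real system would push the message through
    # the target application's messaging channel.
    print(f"{from_account} -> {to_account}: {text}")

def send_recalls(active_ids, active_features, friend_of_lost, third_subset):
    """active_ids: ids of active accounts; active_features: matching feature rows;
    friend_of_lost: {loss_account_id: closest active friend id};
    third_subset: loss accounts to be recalled."""
    # Cluster the active accounts into player categories (e.g. core / struggling).
    model = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
    labels = model.fit_predict(np.asarray(active_features))
    category = dict(zip(active_ids, labels))

    # Personalised recall message per active-account category (illustrative).
    messages = {
        0: "Come team up with {friend} and push the map together!",
        1: "{friend} has been struggling lately and urgently needs your help!",
    }

    for lost_id in third_subset:
        friend_id = friend_of_lost[lost_id]
        text = messages[category[friend_id]].format(friend=friend_id)
        send_message(from_account=friend_id, to_account=lost_id, text=text)
```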
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.
According to another aspect of the embodiment of the invention, a recall device for a loss account is also provided, wherein the recall device is used for implementing the recall method for the loss account. As shown in fig. 9, the apparatus includes: an obtaining module 902, configured to obtain a loss account set to be identified and a corresponding loss account feature set, where the loss account set includes accounts currently in a loss state in a target application, and the loss account feature set includes a plurality of features of each loss account in the loss account set; an execution module 904, configured to execute a clustering operation on the loss account set in a plurality of predetermined clusters according to the loss account feature set, so as to obtain a first loss account subset in the loss account set, where the cluster to which the first loss account subset belongs is a target cluster whose recall probability satisfies a first preset condition among the plurality of clusters, and the plurality of features of the loss accounts in the first loss account subset satisfy the clustering condition corresponding to the target cluster; an input module 906, configured to input the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, where the loss accounts in the second loss account subset are the loss accounts predicted by the target neural network model to belong to the recall account category; and a determining module 908, configured to determine a third loss account subset to be recalled according to the first loss account subset and the second loss account subset.
Optionally, the device is further configured to input the feature set of the attrition account to a first prediction neural network model in the target neural network model, so as to obtain a first probability set predicted by the first prediction neural network model, where the first probability set includes a probability that each attrition account in the attrition account set belongs to the recall account category, and the first prediction neural network model is configured to determine, according to the feature set of the attrition account, a probability that each attrition account belongs to the recall account category; inputting the loss account feature set and the first probability set into a second prediction neural network model in the target neural network model to obtain a second probability set predicted by the second prediction neural network model, wherein the second probability set comprises the probability that each loss account in the loss account set belongs to the recall account category, and the second prediction neural network model is used for determining 2-order cross features and higher than 2-order cross features according to the loss account feature set and the first probability set, and determining the probability that each loss account belongs to the recall account category according to the 2-order cross features and the higher than 2-order cross features; and determining the second loss account subset in the loss account set according to the first probability set and the second probability set.
Optionally, the device is further configured to determine a final prediction probability that each attrition account in the attrition account set belongs to the recall account category according to the first probability set and the second probability set, where the final prediction probability of each attrition account is a mean value of probabilities corresponding to each attrition account in the first probability set and the second probability set; and searching the loss account number with the final prediction probability larger than a preset threshold value in the loss account number set to obtain the second loss account number subset.
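For illustration only, a sketch of how the two probability sets might be fused into a final prediction and thresholded to obtain the second loss account subset; the probability arrays and the threshold of 0.5 are illustrative assumptions, not values taken from the patent.

```python
# For illustration only: average the probabilities from the first and second
# prediction models and keep accounts above a preset threshold as the second
# loss account subset. The arrays and the 0.5 threshold are assumptions.
import numpy as np

def second_loss_subset(account_ids, first_probs, second_probs, threshold=0.5):
    """first_probs / second_probs: per-account probabilities of belonging to the
    recall account category; the final prediction is their mean."""
    final_probs = (np.asarray(first_probs) + np.asarray(second_probs)) / 2.0
    return [aid for aid, p in zip(account_ids, final_probs) if p > threshold]

# Example usage with made-up probabilities.
print(second_loss_subset(["a", "b", "c"], [0.9, 0.2, 0.6], [0.7, 0.4, 0.3]))  # -> ['a']
```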
Optionally, the device is further configured to obtain a feature set of a lost sample account corresponding to a lost sample account set and a corresponding actual labeling result set, where the lost sample account set includes an account in a lost state in the target application in a preset first time period, the feature set of the lost sample account includes a plurality of features of each lost account in the lost sample account set, and each labeling result in the actual labeling result set indicates whether the corresponding lost account in the lost sample account set actually becomes a recall account in a preset second time period, where the second time period is later than the first time period; training a first neural network model by using the loss sample account feature set and the actual labeling result set until a loss function between a first prediction labeling result set output by the first neural network model to be trained and the actual labeling result set meets a second preset condition, so as to obtain the first prediction neural network model, wherein each labeling result in the first prediction labeling result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
Optionally, the device is further configured to train a second neural network model to be trained by using the feature set of the lost sample account, the actual labeling result set and the first prediction labeling result set, until a loss function between the second prediction labeling result set output by the second neural network model to be trained and the actual labeling result set meets a third preset condition, so as to obtain the second prediction neural network model, where each labeling result in the second prediction labeling result set represents a probability that a corresponding lost account in the predicted lost sample account set becomes a recall account in the second time period.
Optionally, the device is further configured to obtain a recall sample account feature set corresponding to the recall sample account set, where each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, and the recall sample account feature set includes a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period; determining a group of key features and feature values of the group of key features according to the recall sample account feature set; and clustering lost accounts in a lost sample account set by using the set of key features and the feature values of the set of key features to obtain the plurality of clustering clusters and clustering conditions corresponding to each clustering cluster, wherein the clustering conditions corresponding to each clustering cluster comprise one or more key features and the corresponding feature values in the set of key features, the lost sample account set comprises the recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
Optionally, the device is further configured to determine, when the number of features satisfying a current feature condition in the recall sample account feature set is greater than a preset threshold, a current feature corresponding to the current feature condition as a key feature, and determine, as a feature value of the key feature, a feature value of the current feature corresponding to the current feature condition, where the current feature condition is a logic judgment condition executed on the current feature and the feature value of the current feature.
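For illustration only, a sketch of how key features might be selected by counting how many recall sample accounts satisfy each current feature condition; the condition predicates and the preset threshold are hypothetical.

```python
# For illustration only: a current feature becomes a key feature when more than
# a preset number of recall sample accounts satisfy the current feature
# condition. Conditions and the threshold below are hypothetical.
def find_key_features(sample_features, conditions, min_count=1000):
    """sample_features: per-account feature dicts of the recall sample accounts.
    conditions: {feature_name: predicate over that feature's value}.
    Returns the features whose condition is satisfied by more than min_count
    recall sample accounts, together with the qualifying predicate."""
    key_features = {}
    for name, predicate in conditions.items():
        count = sum(1 for f in sample_features if name in f and predicate(f[name]))
        if count > min_count:
            key_features[name] = predicate
    return key_features

# Example: "historical payment greater than 100" as a current feature condition.
# key = find_key_features(samples, {"historical_payment": lambda v: v > 100})
```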
Optionally, the device is further configured to obtain the set of key features and the corresponding feature values of each of the attrition accounts from the feature set of the attrition accounts; and clustering each loss account in the loss account set into the clustering clusters according to the acquired key features and corresponding feature values of each loss account.
Optionally, the device is further configured to determine, according to a plurality of features of each attrition account, a distance between each attrition account and a cluster core of the target cluster, where the target cluster is a cluster with a highest recall probability among the plurality of clusters; and determining the first loss account subset from the loss account set according to the distance between each loss account and the cluster core of the target cluster, wherein the distance corresponding to the loss account in the first loss account subset meets a preset distance condition.
Optionally, the above device is further configured to determine the union of the first loss account subset and the second loss account subset as the third loss account subset.
Optionally, after determining the third loss account subset to be recalled, the device is further configured to search an active account related to the third loss account subset in the account set of the target application to obtain an active account set, where the active account set includes accounts currently in an active state in the target application, and a correlation degree between the active accounts in the active account set and the corresponding loss accounts in the third loss account subset meets a fourth preset condition; and sending preset recall information to the third fluid loss account subset through the active account set.
Optionally, the device is further configured to perform a clustering operation on the active account set according to a feature of each active account in the active account set, so as to obtain a target clustering result, where the target clustering result includes an account category to which each active account in the active account set belongs; setting recall information corresponding to the account category to which each active account belongs for each active account in the active account set; and sending preset recall information corresponding to the account category to which each active account belongs to the corresponding loss account in the third loss account subset through each active account in the active account set.
According to still another aspect of the embodiment of the present invention, there is further provided an electronic device for implementing the recall method of the above-mentioned attrition account, where the electronic device may be a terminal device or a server as shown in fig. 1. The present embodiment is described taking the electronic device as a server as an example. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.
Alternatively, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of the computer network.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, acquiring a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts which are currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set;
s2, according to the lost account feature set, clustering operation is carried out on the lost account set in a plurality of predetermined clusters, and a first lost account subset in the lost account set is obtained, wherein the cluster to which the first lost account subset belongs is a target cluster, the recall probability of which meets a first preset condition, in the plurality of clusters, and a plurality of features of the lost account in the first lost account subset meet the clustering condition corresponding to the target cluster;
s3, inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are the loss accounts which belong to the recall account category and are predicted by the target neural network model;
s4, determining a third loss account subset to be recalled according to the first loss account subset and the second loss account subset.
Alternatively, as will be appreciated by those skilled in the art, the structure shown in fig. 10 is merely illustrative, and the electronic device may be a smart phone (such as an Android mobile phone, an iOS mobile phone, etc.), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Devices, MID), a PAD, or another terminal device. Fig. 10 does not limit the structure of the above-mentioned electronic device. For example, the electronic device may also include more or fewer components (such as a network interface) than shown in fig. 10, or have a different configuration than shown in fig. 10.
The memory 1002 may be used to store software programs and modules, such as the program instructions/modules corresponding to the recall method and device for a loss account in the embodiment of the present invention, and the processor 1004 executes the software programs and modules stored in the memory 1002 to perform various functional applications and data processing, that is, to implement the recall method for a loss account described above. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may specifically, but not exclusively, store information such as the loss account feature sets and the recall information described above. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the acquisition module 902, the execution module 904, the input module 906, and the determination module 908 in the recall device for the loss account. In addition, the memory may further include, but is not limited to, other module units in the recall device for the loss account, which are not described in detail in this example.
Optionally, the transmission device 1006 is configured to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission means 1006 includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 1006 is a Radio Frequency (RF) module for communicating with the internet wirelessly.
In addition, the electronic device further includes: a display 1008 for displaying the information to be processed; and a connection bus 1010 for connecting the respective module parts in the above-described electronic device.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the plurality of nodes through a network communication. Among them, the nodes may form a Peer-To-Peer (P2P) network, and any type of computing device, such as a server, a terminal, etc., may become a node in the blockchain system by joining the Peer-To-Peer network.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the methods provided in the various alternative implementations described above. Wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described computer-readable storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts which are currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set;
s2, according to the lost account feature set, clustering operation is carried out on the lost account set in a plurality of predetermined clusters, and a first lost account subset in the lost account set is obtained, wherein the cluster to which the first lost account subset belongs is a target cluster, the recall probability of which meets a first preset condition, in the plurality of clusters, and a plurality of features of the lost account in the first lost account subset meet the clustering condition corresponding to the target cluster;
s3, inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are the loss accounts which belong to the recall account category and are predicted by the target neural network model;
s4, determining a third loss account subset to be recalled according to the first loss account subset and the second loss account subset.
Alternatively, in this embodiment, it will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be performed by a program instructing a terminal device to execute the steps, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the method described in the embodiments of the present invention.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; for example, the division of the units is merely a logical function division, and there may be another division manner in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed between the parts may be through some interfaces, units or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims (14)

1. A recall method for a lost account, comprising:
acquiring a to-be-identified loss account set and a corresponding loss account feature set, wherein the loss account set comprises accounts currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set;
according to the lost account feature set, clustering operation is carried out on the lost account set in a plurality of preset clustering clusters to obtain a first lost account subset in the lost account set, wherein the clustering cluster to which the first lost account subset belongs is a target clustering cluster, the recall probability of which meets a first preset condition, in the plurality of clustering clusters, and the plurality of features of the lost accounts in the first lost account subset meet the clustering condition corresponding to the target clustering cluster;
Inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are loss accounts which belong to recall account categories and are predicted by the target neural network model, and the probability that the loss accounts belong to the recall account categories and the probability that the loss accounts are recalled are in positive correlation;
the first loss account subset and the second loss account subset are subjected to de-duplication to obtain a third loss account subset, and preset recall information is sent to the third loss account subset;
the method further comprises the steps of:
obtaining a recall sample account feature set corresponding to the recall sample account set, wherein each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, the recall sample account feature set comprises a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period;
determining a group of key features and feature values of the group of key features according to the recall sample account feature set;
And clustering lost accounts in a lost sample account set by using the set of key features and the feature values of the set of key features to obtain the plurality of clustering clusters and clustering conditions corresponding to each clustering cluster, wherein the clustering conditions corresponding to each clustering cluster comprise one or more key features and the corresponding feature values in the set of key features, the lost sample account set comprises the recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
2. The method of claim 1, wherein inputting the set of attrition account features into a target neural network model results in a second subset of attrition accounts in the set of attrition accounts, comprising:
inputting the loss account feature set into a first prediction neural network model in the target neural network model to obtain a first probability set predicted by the first prediction neural network model, wherein the first probability set comprises the probability that each loss account in the loss account set belongs to the recall account class, and the first prediction neural network model is used for determining the probability that each loss account belongs to the recall account class according to the loss account feature set;
Inputting the loss account feature set and the first probability set into a second prediction neural network model in the target neural network model to obtain a second probability set predicted by the second prediction neural network model, wherein the second probability set comprises the probability that each loss account in the loss account set belongs to the recall account category, and the second prediction neural network model is used for determining 2-order cross features and higher than 2-order cross features according to the loss account feature set and the first probability set, and determining the probability that each loss account belongs to the recall account category according to the 2-order cross features and the higher than 2-order cross features;
and determining the second loss account subset in the loss account set according to the first probability set and the second probability set.
3. The method of claim 2, wherein the determining the second subset of loss accounts in the set of loss accounts according to the first set of probabilities and the second set of probabilities comprises:
determining the final prediction probability of each loss account in the loss account set belonging to the recall account category according to the first probability set and the second probability set, wherein the final prediction probability of each loss account is the average value of probabilities corresponding to each loss account in the first probability set and the second probability set;
And searching the loss account number with the final prediction probability larger than a preset threshold value in the loss account number set to obtain the second loss account number subset.
4. The method according to claim 2, wherein the method further comprises:
obtaining a corresponding lost sample account feature set and a corresponding actual labeling result set of a lost sample account set, wherein the lost sample account set comprises accounts which are in a lost state in a preset first time period in the target application, the lost sample account feature set comprises a plurality of features of each lost account in the lost sample account set, each labeling result in the actual labeling result set represents whether the corresponding lost account in the lost sample account set actually becomes a recall account in a preset second time period, and the second time period is later than the first time period;
training a first neural network model by using the loss sample account feature set and the actual labeling result set until a loss function between a first prediction labeling result set output by the first neural network model to be trained and the actual labeling result set meets a second preset condition, so as to obtain the first prediction neural network model, wherein each labeling result in the first prediction labeling result set represents the probability that a corresponding loss account in the predicted loss sample account set becomes a recall account in the second time period.
5. The method according to claim 4, wherein the method further comprises:
training a second neural network model to be trained by using the lost sample account feature set, the actual labeling result set and the first prediction labeling result set until a loss function between the second prediction labeling result set output by the second neural network model to be trained and the actual labeling result set meets a third preset condition, so as to obtain the second prediction neural network model, wherein each labeling result in the second prediction labeling result set represents the probability that a corresponding lost account in the predicted lost sample account set becomes a recall account in the second time period.
6. The method of claim 1, wherein determining a set of key features and feature values for the set of key features from the recall sample account feature set comprises:
when the number of the features meeting the current feature condition in the recall sample account feature set is larger than a preset threshold, determining the current feature corresponding to the current feature condition as a key feature, and determining the feature value of the current feature corresponding to the current feature condition as the feature value of the key feature, wherein the current feature condition is a logic judgment condition executed on the current feature and the feature value of the current feature.
7. The method according to claim 1, wherein the clustering the missing account set in a predetermined plurality of clusters according to the missing account feature set to obtain a first subset of missing accounts in the missing account set, includes:
acquiring the key features and the corresponding feature values of each loss account from the loss account feature set;
and clustering each loss account in the loss account set into the clustering clusters according to the acquired key features and corresponding feature values of each loss account.
8. The method of claim 1, wherein performing a clustering operation on the set of attrition accounts in a predetermined plurality of clusters based on the set of attrition account characteristics to obtain a first subset of attrition accounts in the set of attrition accounts comprises:
determining the distance between each loss account and the cluster core of the target cluster according to the characteristics of each loss account, wherein the target cluster is the cluster with the highest recall probability in the clusters;
And determining the first loss account subset from the loss account set according to the distance between each loss account and the cluster core of the target cluster, wherein the distance corresponding to the loss account in the first loss account subset meets a preset distance condition.
9. The method according to any one of claims 1 to 8, wherein obtaining a third loss account subset by de-duplicating the first loss account subset and the second loss account subset, and sending preset recall information to the third loss account subset, comprises:
and determining the union of the first loss account subset and the second loss account subset as the third loss account subset.
10. The method of any one of claims 1 to 8, wherein after obtaining the third loss account subset to be recalled, the method further comprises:
searching an active account related to the third loss account subset in the account set of the target application to obtain an active account set, wherein the active account set comprises accounts currently in an active state in the target application, and the correlation degree between the active account in the active account set and the corresponding loss account in the third loss account subset meets a fourth preset condition;
and sending preset recall information to the third loss account subset through the active account set.
11. The method of claim 10, wherein the sending, by the active account set, preset recall information to the third subset of lost accounts comprises:
according to the characteristics of each active account in the active account set, clustering operation is carried out on the active account set to obtain a target clustering result, wherein the target clustering result comprises account categories to which each active account in the active account set belongs;
setting recall information corresponding to the account category to which each active account belongs for each active account in the active account set;
and sending preset recall information corresponding to the account category to which each active account belongs to the corresponding loss account in the third loss account subset through each active account in the active account set.
12. A recall device for a lost account, comprising:
the system comprises an acquisition module, a verification module and a verification module, wherein the acquisition module is used for acquiring a to-be-identified loss account set and a corresponding loss account feature set, the loss account set comprises accounts which are currently in a loss state in a target application, and the loss account feature set comprises a plurality of features of each loss account in the loss account set;
The execution module is used for executing clustering operation on the loss account set in a plurality of preset clusters according to the loss account feature set to obtain a first loss account subset in the loss account set, wherein the cluster to which the first loss account subset belongs is a target cluster of which recall probability in the plurality of clusters meets a first preset condition, and a plurality of features of the loss account in the first loss account subset meet the clustering condition corresponding to the target cluster;
the input module is used for inputting the loss account feature set into a target neural network model to obtain a second loss account subset in the loss account set, wherein the loss accounts in the second loss account subset are the loss accounts which belong to the recall account category and are predicted by the target neural network model, and the probability that the loss accounts belong to the recall account category and the probability that the loss accounts are recalled are in positive correlation;
the determining module is used for obtaining a third loss account subset by de-duplicating the first loss account subset and the second loss account subset, and sending preset recall information to the third loss account subset;
The device is also for:
obtaining a recall sample account feature set corresponding to the recall sample account set, wherein each account in the recall sample account set is in a loss state in a preset first time period and becomes a recall account in a preset second time period, the recall sample account feature set comprises a plurality of features of each account in the recall sample account set, and the second time period is later than the first time period;
determining a group of key features and feature values of the group of key features according to the recall sample account feature set;
and clustering lost accounts in a lost sample account set by using the set of key features and the feature values of the set of key features to obtain the plurality of clustering clusters and clustering conditions corresponding to each clustering cluster, wherein the clustering conditions corresponding to each clustering cluster comprise one or more key features and the corresponding feature values in the set of key features, the lost sample account set comprises the recall sample account set and an unrecalled sample account set, and each account in the unrecalled sample account set is in a lost state in the first time period and does not become a recall account in the second time period.
13. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run performs the method of any one of claims 1 to 11.
14. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 11 by means of the computer program.